How to make POST, PUT and DELETE requests using Puppeteer?

Making POST, PUT, and DELETE requests is a crucial technique for web scraping and web testing. However, Puppeteer's API does not expose this functionality as a separate function.

Let's look at a workaround for this situation and create a helper function to sort it out.



Why do we need POST while web scraping?

POST is one of several HTTP request methods. By design, it is used to send data to a web server for processing and possible storage.

POST requests are one of the main ways to submit form data during login or registration, and a common way to send any other data to a web server.

In the past, a common pattern for login or registration was to send the form data with the required authorization parameters in a POST request and receive the protected content in the response (along with cookies, so that the credentials do not have to be entered again).

Nowadays, SPAs (Single Page Applications) also use POST requests to send data to an API, but such requests usually return only the data needed to update the page rather than the whole page.

Thus, many sites rely on POST requests for client-server communication, so a scraper needs to be able to send them too.

Unfortunately, Puppeteer does not provide a native way to open a page with a method other than GET, but it is not hard to create a workaround.



Interception of the initial request

The idea behind our approach is quite simple: we change the request method while the page is being opened, so the POST data is sent as part of the navigation.

To do that, we enable request interception and handle the request in a page.on('request') handler.

We'll use HTTPBin to test our solution.

Let's start with a simple JS snippet that just opens HTTPBin's POST endpoint:

const puppeteer = require('puppeteer');
const TARGET_URL = 'https://httpbin.org/post';

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto(TARGET_URL);
    console.log(await page.content());
    // Close the browser so the Node.js process can exit.
    await browser.close();
})();

The result is definitely not what we want, because page.goto() issues a GET request and this endpoint accepts only POST:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html><head><title>405 Method Not Allowed</title>
</head><body><h1>Method Not Allowed</h1>
<p>The method is not allowed for the requested URL.</p>
</body></html>

So, let’s add a request interception:

const puppeteer = require('puppeteer');
const TARGET_URL = 'https://httpbin.org/post';
const POST_JSON = { hello: 'I like ScrapingAnt' };

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.setRequestInterception(true);
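    // Override only the first request, which is the navigation itself.
    page.once('request', request => {
        request.continue({
            method: 'POST',
            postData: JSON.stringify(POST_JSON),
            // headers() is a method and has to be called
            headers: request.headers(),
        });
    });
    await page.goto(TARGET_URL);
    console.log(await page.content());
    // Close the browser so the Node.js process can exit.
    await browser.close();
})();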

This time our POST request has been successfully executed:

<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">{
"args": {},
"data": "{\"hello\":\"I like ScrapingAnt\"}",

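The same interception trick can be extended to PUT and DELETE. Below is a minimal sketch of such a helper; the name gotoWithMethod and the shape of its options are our own assumptions, not part of Puppeteer's API. Unlike the snippet above, it registers the handler with page.on and continues every other request unchanged, because with interception enabled each request stalls until it is continued, aborted, or responded to. It also switches interception off afterwards so later navigations behave normally.

```javascript
// Hypothetical helper (not part of Puppeteer's API): opens `url` using the
// given HTTP method by overriding the first navigation request.
async function gotoWithMethod(page, url, { method = 'POST', postData, headers = {} } = {}) {
    let overridden = false;

    const handler = request => {
        if (!overridden && request.isNavigationRequest()) {
            overridden = true;
            // Override only the initial navigation request.
            request.continue({
                method,
                postData,
                headers: { ...request.headers(), ...headers },
            });
        } else {
            // Let every other request (redirects, scripts, images) pass through unchanged.
            request.continue();
        }
    };

    await page.setRequestInterception(true);
    page.on('request', handler);
    try {
        return await page.goto(url);
    } finally {
        // Restore the page so that later navigations are plain GET requests again.
        page.off('request', handler);
        await page.setRequestInterception(false);
    }
}

// Usage (inside an async function, with `page` obtained from browser.newPage()):
//   await gotoWithMethod(page, 'https://httpbin.org/put', {
//       method: 'PUT',
//       postData: JSON.stringify({ hello: 'I like ScrapingAnt' }),
//       headers: { 'content-type': 'application/json' },
//   });
//   console.log(await page.content());
```

Passing the page in as a parameter keeps the helper independent of how the browser is launched, and the same call works for DELETE by changing the method and pointing it at an endpoint that accepts it, such as HTTPBin's /delete.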