Intercepting Network Requests with Puppeteer
Puppeteer is well suited to capturing the background requests a website makes. A page often issues many such requests after it loads, and tools like cURL, Wget, or Selenium may not capture them all. With Puppeteer, we can observe, collect, and filter every request the page issues. The snippet below extracts the Cloudflare URLs requested by flightradar24.com.
const puppeteer = require('puppeteer');

(async () => {
  let browser;
  try {
    browser = await puppeteer.launch();
    const page = await browser.newPage();
    const cloudflareUrls = [];
    // Record every request whose URL mentions Cloudflare.
    page.on('request', request => {
      if (request.url().includes('cloudflare')) {
        cloudflareUrls.push(request.url());
      }
    });
    await page.goto('https://www.flightradar24.com/data/airlines/arz');
    console.log(cloudflareUrls);
  } catch (err) {
    // The try/catch sits inside the async function so it catches rejected awaits.
    console.error(err);
  } finally {
    // Close the browser even when navigation fails.
    if (browser) await browser.close();
  }
})();
In the Puppeteer snippet above, we navigate to the URL 'https://www.flightradar24.com/data/airlines/arz' and collect every request whose URL contains "cloudflare". Running it prints output like the following:
[
  'https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css',
  'https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/css/bootstrap.min.css',
  'https://cdnjs.cloudflare.com/ajax/libs/toastr.js/2.1.4/toastr.min.css',
  'https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.24.0/moment.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/moment-duration-format/2.2.2/moment-duration-format.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/URI.js/1.19.1/URI.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/localforage/1.4.3/localforage.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/highcharts/4.1.10/highcharts.js',
  'https://cdnjs.cloudflare.com/ajax/libs/toastr.js/2.1.4/toastr.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/fonts/fontawesome-webfont.woff2?v=4.7.0'
]
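A substring check also matches any URL that merely contains "cloudflare" in its path or query string. If only hosts under cloudflare.com are of interest (an assumption for this sketch), a stricter test could parse each URL and compare its hostname. The helper name isCloudflareHost is hypothetical and not part of Puppeteer.

```javascript
// Hypothetical helper: true only when the URL's host is cloudflare.com
// or one of its subdomains (for example cdnjs.cloudflare.com).
function isCloudflareHost(rawUrl) {
  let hostname;
  try {
    // URL is a global in Node.js and throws on unparseable input.
    hostname = new URL(rawUrl).hostname;
  } catch (err) {
    return false; // not a parseable absolute URL
  }
  return hostname === 'cloudflare.com' || hostname.endsWith('.cloudflare.com');
}
```

Inside the request handler, isCloudflareHost(request.url()) would take the place of the includes('cloudflare') check.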
Capturing requests in this way is useful for identifying the AJAX and other background requests a website makes.
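To narrow the capture to AJAX calls specifically, one option is to filter on the resource type that Puppeteer reports for each request, rather than on the URL. The sketch below assumes that 'xhr' and 'fetch' are the types of interest. The names BACKGROUND_TYPES, isBackgroundRequest, and collectBackgroundRequests are hypothetical helpers for illustration, while resourceType(), method(), and url() are methods of Puppeteer's request object.

```javascript
// Resource types that Puppeteer reports for script-initiated calls.
const BACKGROUND_TYPES = new Set(['xhr', 'fetch']);

// Returns true when a request looks like an AJAX/background call.
function isBackgroundRequest(request) {
  return BACKGROUND_TYPES.has(request.resourceType());
}

// Hypothetical helper: registers a 'request' listener on `page` and returns
// the array that fills with { method, url } entries as requests are issued.
function collectBackgroundRequests(page) {
  const collected = [];
  page.on('request', (request) => {
    if (isBackgroundRequest(request)) {
      collected.push({ method: request.method(), url: request.url() });
    }
  });
  return collected;
}

// Usage inside the async function of the earlier snippet, before page.goto():
//   const ajaxCalls = collectBackgroundRequests(page);
//   await page.goto('https://www.flightradar24.com/data/airlines/arz');
//   console.log(ajaxCalls);
```

Because the listener only needs an object with an on() method, the helper can be exercised with a stub emitter before pointing it at a real page.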