The TechnoTreat


Intercepting Network Requests with Puppeteer

Thetechnotreat
November 16, 2020

Puppeteer can capture the background requests a website makes while it loads. A single page often triggers many such requests for scripts, stylesheets, fonts, and API calls. cURL and Wget fetch only the URL they are given and do not execute JavaScript, so they never see these requests, and Selenium may not fully capture them. With Puppeteer, we can intercept, collect, and filter every network request the page issues. Below is a code snippet that extracts Cloudflare URLs from the requests made by flightradar24.com.

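const puppeteer = require('puppeteer');

(async () => {
	let browser;
	// try/catch must live inside the async function: a try block wrapped
	// around the async call cannot catch errors from awaited promises.
	try {
		browser = await puppeteer.launch();
		const page = await browser.newPage();
		const cloudflareUrls = [];
		// Fires once for every request the page issues.
		page.on('request', request => {
			if (request.url().includes('cloudflare')) {
				cloudflareUrls.push(request.url());
			}
		});
		// By default goto() resolves on the load event, so requests sent
		// after that point are not collected.
		await page.goto('https://www.flightradar24.com/data/airlines/arz');
		console.log(cloudflareUrls);
	} catch (err) {
		console.error(err);
	} finally {
		// Close the browser even if navigation failed.
		if (browser) await browser.close();
	}
})();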

The snippet above navigates to https://www.flightradar24.com/data/airlines/arz and records every request whose URL contains the string 'cloudflare'. It prints a list like this:

[
  'https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css',
  'https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/css/bootstrap.min.css',
  'https://cdnjs.cloudflare.com/ajax/libs/toastr.js/2.1.4/toastr.min.css',
  'https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.24.0/moment.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/moment-duration-format/2.2.2/moment-duration-format.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/URI.js/1.19.1/URI.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/localforage/1.4.3/localforage.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/highcharts/4.1.10/highcharts.js',
  'https://cdnjs.cloudflare.com/ajax/libs/toastr.js/2.1.4/toastr.min.js',
  'https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/fonts/fontawesome-webfont.woff2?v=4.7.0'
]
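A substring check on 'cloudflare' also matches URLs that merely mention the word in a path or query string. As a minimal sketch of a stricter filter, the helper below compares the parsed hostname instead. The name filterByDomain and the sample URLs are assumptions made for illustration; they are not part of Puppeteer or of the original snippet.

```javascript
// Hypothetical helper (not part of Puppeteer): keep only URLs whose host is
// the given domain or one of its subdomains.
function filterByDomain(urls, domain) {
  return urls.filter((raw) => {
    let host;
    try {
      host = new URL(raw).hostname;
    } catch (err) {
      return false; // skip strings that are not absolute URLs
    }
    return host === domain || host.endsWith('.' + domain);
  });
}

// Illustrative input: only the first entry is actually served by cloudflare.com.
const sample = [
  'https://cdnjs.cloudflare.com/ajax/libs/toastr.js/2.1.4/toastr.min.js',
  'https://example.com/redirect?to=cloudflare', // matches a substring check only
  'data:text/plain,cloudflare', // has no hostname at all
];

console.log(filterByDomain(sample, 'cloudflare.com'));
```

Run against the sample list, this keeps only the cdnjs.cloudflare.com entry. The same function could be applied to the array collected in the request handler before printing it.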

This technique is useful for identifying the AJAX calls and other background requests a website makes, which are not visible in the page's HTML source.
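To narrow the capture to AJAX-style traffic, one option is to check the resource type Puppeteer reports for each request, which is 'xhr' or 'fetch' for script-initiated calls. The following is a rough sketch rather than a definitive implementation: collectBackgroundRequests is a hypothetical helper name, and it assumes it is handed a Puppeteer page object before navigation starts.

```javascript
// Resource types Puppeteer reports for script-initiated background calls.
const BACKGROUND_TYPES = new Set(['xhr', 'fetch']);

// Hypothetical helper: attach a 'request' listener to a Puppeteer page and
// return the array that fills up with background requests as the page loads.
function collectBackgroundRequests(page) {
  const collected = [];
  page.on('request', (request) => {
    if (BACKGROUND_TYPES.has(request.resourceType())) {
      collected.push({ method: request.method(), url: request.url() });
    }
  });
  return collected;
}

// Usage with a real page (assumes `page` came from browser.newPage()):
//   const calls = collectBackgroundRequests(page);
//   await page.goto('https://www.flightradar24.com/data/airlines/arz');
//   console.log(calls);
```

Images, stylesheets, fonts, and the main document are skipped, so the resulting list contains only the calls made by the page's scripts.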
