puppeteer-extra vs curl-cffi
Puppeteer-extra is a modular plugin framework that wraps Puppeteer (and Playwright) to add extra functionality through a plugin system. It acts as a drop-in replacement for Puppeteer while enabling powerful extensions for stealth, captcha solving, ad blocking, and more.
The most popular plugins include:
- puppeteer-extra-plugin-stealth: Applies various evasion techniques to make the automated browser harder to detect. Patches common detection vectors such as navigator.webdriver, chrome.runtime, and WebGL renderer strings. This is the most widely used Puppeteer stealth solution.
- puppeteer-extra-plugin-recaptcha: Automatically detects and solves reCAPTCHA and hCaptcha challenges using third-party solving services (2captcha, anti-captcha).
- puppeteer-extra-plugin-adblocker: Blocks ads and trackers to speed up page loading and reduce bandwidth usage during scraping.
- puppeteer-extra-plugin-anonymize-ua: Anonymizes the User-Agent string (for example, by removing the headless marker) to reduce fingerprinting.
Key features of the framework:
- Drop-in replacement: Use puppeteer-extra instead of puppeteer in your imports; existing code works without changes.
- Plugin composition: Multiple plugins can be stacked and they work together without conflicts.
- Playwright support: The same plugin system works with Playwright via playwright-extra.
- Community plugins: An active community creates and maintains plugins for various use cases.
Puppeteer-extra is the go-to solution for adding stealth capabilities to Puppeteer-based scrapers without rewriting existing code.
Curl-cffi is a Python binding for curl-impersonate, an HTTP client that can impersonate popular web browsers such as:
- Google Chrome
- Microsoft Edge
- Safari
- Firefox
Unlike requests and httpx, which are pure-Python libraries, curl-cffi is built on cURL and inherits its powerful features,
such as extensive HTTP protocol support, along with curl-impersonate's patches against TLS and HTTP fingerprinting.
Using curl-cffi, web scrapers can avoid being detected through their TLS and HTTP fingerprints.
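As a rough illustration of the impersonation described above, the sketch below wraps curl-cffi's requests-style API in a small helper. The helper name fetch_as_browser and its defaults are hypothetical and chosen only for this example. The "chrome" target is assumed to be accepted by the installed curl-cffi version, since the set of supported browser targets changes between releases.

```python
def fetch_as_browser(url: str, browser: str = "chrome", timeout: float = 30):
    """Fetch `url` while presenting the fingerprint of `browser`.

    Hypothetical helper for illustration; not part of curl-cffi itself.
    """
    # Imported lazily so this module still loads where curl-cffi
    # (installed with `pip install curl_cffi`) is not available.
    from curl_cffi import requests

    # `impersonate` selects the browser profile to mimic. The accepted
    # target names depend on the installed curl-cffi version (assumption:
    # "chrome" is available as an alias for a recent Chrome profile).
    return requests.get(url, impersonate=browser, timeout=timeout)
```

Calling fetch_as_browser("https://example.com") (a placeholder URL) should return a response object whose status_code and text can be read much as with the requests library. Passing a different browser value switches the impersonated profile without changing the rest of the scraper.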