# Browser Libraries
Beyond the core browser automation tools (Playwright, Puppeteer, Selenium), a growing ecosystem of libraries focuses on specialized browser-based scraping needs: anti-detection, AI-powered control, and stealth enhancement.
## Anti-Detect Browsers
These libraries are designed to make browser automation invisible to anti-bot systems. They address detection vectors like navigator.webdriver, CDP fingerprinting, TLS fingerprinting, and canvas/WebGL fingerprinting.
| Library | Language | Base Browser | Approach |
|---|---|---|---|
| nodriver | Python | Chrome | Direct CDP, no WebDriver dependency |
| camoufox | Python | Firefox | C++-level patches, realistic fingerprints |
| pydoll | Python | Chrome | CDP-native, network interception |
| undetected-chromedriver | Python | Chrome | Patched chromedriver, Selenium-based |
| puppeteer-extra + stealth | NodeJS | Chrome | Plugin framework with stealth patches |
| selenium-driverless | Python | Chrome | Selenium API without chromedriver binary |
Choosing an anti-detect library:
- nodriver is the recommended default for Python: fast, modern, and maintained by the author of undetected-chromedriver.
- camoufox is best when you need Firefox specifically (some anti-bot systems treat Firefox differently than Chrome).
- puppeteer-extra with stealth plugin is the standard for JavaScript/NodeJS.
- selenium-driverless is useful when you need Selenium's API compatibility without chromedriver.
## AI Browser Agents
A new category of tools uses large language models to control browsers through natural language instructions. Instead of writing selectors, you describe what you want and the AI navigates, clicks, and extracts.
| Library | Language | Approach |
|---|---|---|
| browser-use | Python | LLM agent + Playwright, multi-step task automation |
| stagehand | NodeJS | act/extract/observe primitives, TypeScript, Browserbase |
| skyvern | Python | LLM + computer vision, screenshot-based interaction |
| crawl4ai | Python | LLM extraction with markdown conversion |
| scrapegraphai | Python | Graph-based LLM pipelines with Pydantic schemas |
When to use AI browser agents:
- Scraping diverse sites with varying layouts, where no single CSS selector works across them
- Rapid prototyping without studying page structure first
- Complex multi-step workflows (login → navigate → fill form → extract)

When to avoid them:
- High-volume production scraping, since the LLM API cost accrues per page
- Sites with stable, simple HTML, where traditional selectors are cheaper and faster
## TLS/HTTP Fingerprint Libraries
A different approach to avoiding detection operates at the HTTP connection level rather than the browser level. These libraries impersonate real browsers' TLS handshakes, HTTP/2 settings, and header ordering.
| Library | Language | Approach |
|---|---|---|
| curl-cffi | Python | cURL-based, Chrome/Firefox/Safari impersonation |
| primp | Python | Rust-powered, lightweight browser fingerprint matching |
| hrequests | Python | requests-like API with TLS fingerprinting |
| curl-impersonate | C (CLI) | Patched cURL builds that replicate browser TLS handshakes |
These libraries are useful when a full browser is unnecessary but standard HTTP clients (requests, httpx) get blocked by TLS/HTTP fingerprinting; they are far faster and lighter than running a browser.
## Comparison
| Need | Recommended Approach |
|---|---|
| JavaScript rendering required | Playwright, Puppeteer, or anti-detect browser |
| Blocked by TLS/HTTP fingerprinting | curl-cffi, primp |
| Blocked by browser fingerprinting | nodriver, camoufox, puppeteer-extra |
| Diverse sites, varying layouts | AI browser agent (browser-use, stagehand) |
| Rapid prototyping | crawl4ai, scrapegraphai |
| High-volume production | Playwright or curl-cffi + traditional selectors |
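The table above can be distilled into a rough decision helper. This is a hypothetical sketch, not part of any library: the function name, flags, and check order are illustrative, and real targets often combine several of these constraints.

```python
def recommend(needs_js=False, tls_blocked=False, fp_blocked=False,
              diverse_layouts=False, high_volume=False):
    """Map scraping constraints to a tool family, mirroring the table above."""
    if diverse_layouts:
        # No single selector works across sites: let an LLM agent adapt.
        return "AI browser agent (browser-use, stagehand)"
    if fp_blocked:
        # Browser-level fingerprinting: use an anti-detect browser.
        return "nodriver, camoufox, or puppeteer-extra"
    if needs_js:
        return "Playwright, Puppeteer, or an anti-detect browser"
    if tls_blocked:
        # Connection-level fingerprinting only: no browser needed.
        return "curl-cffi or primp"
    if high_volume:
        return "Playwright or curl-cffi with traditional selectors"
    return "plain HTTP client (requests, httpx)"
```

The check order encodes a cost heuristic: prefer the cheapest tool that clears the strictest constraint, escalating to a full browser or an LLM agent only when forced.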
See also: Anti-Bot Protections for a guide to Cloudflare, DataDome, Akamai, and other WAFs; Browser Automation for Playwright/Puppeteer/Selenium basics; Frameworks for full scraping frameworks; and Web Scrapers for ready-to-use scrapers for popular websites.