pydoll
Pydoll is a Python library for browser automation that speaks the Chrome DevTools Protocol (CDP) directly and is designed to avoid detection by anti-bot systems. Unlike Selenium-based tools, it does not use WebDriver, so it avoids the common detection vectors tied to WebDriver.
Key features include:
- Native CDP communication: Connects directly to Chrome/Chromium via a CDP WebSocket without intermediary drivers, avoiding the automation flags and fingerprints that WebDriver-based tools leave behind.
- Event-driven architecture: Built around an async event system that can listen for and react to browser events like network requests, console messages, and DOM changes.
- Network interception: Can intercept, modify, and mock network requests and responses, useful for blocking unnecessary resources or modifying API responses during scraping.
- Async-first design: Fully asynchronous API built on Python's asyncio for efficient concurrent automation.
- Clean API: Provides a high-level, Pythonic API for common browser automation tasks while still allowing direct CDP command execution for advanced use cases.
- Multi-browser support: Can manage multiple browser instances and pages concurrently.
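To make "native CDP communication" and the event-driven model more concrete, here is a minimal standard-library sketch of how CDP messages are framed: commands are JSON objects with an `id`, a `method`, and `params`; responses echo the `id`; events carry a `method` but no `id`. The `CDPSession` class and its methods are hypothetical names for illustration only and are not Pydoll's API. No browser or socket is involved, and the incoming traffic is simulated.

```python
import itertools
import json
from collections import defaultdict


class CDPSession:
    """Toy model of CDP message framing (hypothetical, not Pydoll's API)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._listeners = defaultdict(list)
        self.pending = {}  # command id -> method name, awaiting a response

    def build_command(self, method, params=None):
        """Serialize a command the way it would be sent over the WebSocket."""
        msg_id = next(self._ids)
        self.pending[msg_id] = method
        return json.dumps({"id": msg_id, "method": method, "params": params or {}})

    def on(self, event_name, callback):
        """Register a callback for a CDP event name."""
        self._listeners[event_name].append(callback)

    def handle_incoming(self, raw):
        """Route one incoming message: responses carry an id, events do not."""
        msg = json.loads(raw)
        if "id" in msg:
            method = self.pending.pop(msg["id"], None)
            return ("response", method, msg.get("result", {}))
        for callback in self._listeners[msg["method"]]:
            callback(msg.get("params", {}))
        return ("event", msg["method"], msg.get("params", {}))


session = CDPSession()
seen_urls = []
session.on("Network.requestWillBeSent", lambda p: seen_urls.append(p["request"]["url"]))

outgoing = session.build_command("Page.navigate", {"url": "https://example.com"})
print(outgoing)

# Simulated browser traffic: one event, then the response to command 1.
session.handle_incoming(json.dumps({
    "method": "Network.requestWillBeSent",
    "params": {"request": {"url": "https://example.com/"}},
}))
reply = session.handle_incoming(json.dumps({"id": 1, "result": {"frameId": "F1"}}))
print(seen_urls, reply)
```

A real client would send `outgoing` over the browser's debugging WebSocket and feed each received frame to something like `handle_incoming`; a library such as Pydoll wraps that loop behind its higher-level API.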
Pydoll fills a similar niche to nodriver and camoufox, namely browser automation with a focus on avoiding detection, but differs by providing more granular control over CDP communication and network interception.
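At the protocol level, network interception of this kind is typically built on CDP's `Fetch` domain: the client enables it with URL patterns, the browser emits `Fetch.requestPaused` for each matching request, and the client answers with `Fetch.continueRequest` or `Fetch.failRequest`. The sketch below only constructs those messages as plain dictionaries. The helper names `fetch_enable_command` and `decide`, the blocked resource types, and the sample events are assumptions for illustration, not Pydoll functions.

```python
BLOCKED_RESOURCE_TYPES = {"Image", "Font", "Media"}  # illustrative choice


def fetch_enable_command(msg_id):
    """Command asking the browser to pause every request before it is sent."""
    return {
        "id": msg_id,
        "method": "Fetch.enable",
        "params": {"patterns": [{"urlPattern": "*", "requestStage": "Request"}]},
    }


def decide(paused_event, msg_id):
    """Turn a Fetch.requestPaused event into the command that resolves it."""
    params = paused_event["params"]
    request_id = params["requestId"]
    if params["resourceType"] in BLOCKED_RESOURCE_TYPES:
        # Abort resources the scraper does not need.
        return {
            "id": msg_id,
            "method": "Fetch.failRequest",
            "params": {"requestId": request_id, "errorReason": "BlockedByClient"},
        }
    # Let everything else through unchanged.
    return {
        "id": msg_id,
        "method": "Fetch.continueRequest",
        "params": {"requestId": request_id},
    }


# Simulated events, shaped like the browser's Fetch.requestPaused payload.
image_event = {
    "method": "Fetch.requestPaused",
    "params": {
        "requestId": "req-1",
        "resourceType": "Image",
        "request": {"url": "https://example.com/logo.png"},
    },
}
xhr_event = {
    "method": "Fetch.requestPaused",
    "params": {
        "requestId": "req-2",
        "resourceType": "XHR",
        "request": {"url": "https://example.com/api/items"},
    },
}

enable = fetch_enable_command(1)
blocked = decide(image_event, 2)
allowed = decide(xhr_event, 3)
print(enable["method"], blocked["method"], allowed["method"])
```

Every paused request has to be resolved one way or the other; a request that is never continued or failed stays pending in the browser.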
Highlights
Example Use
```python
import asyncio

from pydoll.browser import Chrome
from pydoll.constants import By


async def main():
    async with Chrome() as browser:
        # Open a new page
        page = await browser.new_page()
        await page.go_to("https://example.com")

        # Find and interact with elements
        search_input = await page.find_element(By.CSS, "input[name='q']")
        await search_input.type_text("web scraping")
        submit_btn = await page.find_element(By.CSS, "button[type='submit']")
        await submit_btn.click()

        # Wait for results and extract content
        await page.wait_element(By.CSS, ".results")
        results = await page.find_elements(By.CSS, ".result-item")
        for result in results:
            title = await result.get_text()
            print(title)

        # Network interception example
        await page.enable_network_interception()
        # intercept and analyze API calls made by the page


asyncio.run(main())
```
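Because the API is asynchronous, several scraping tasks can run concurrently under one event loop. The following is a rough sketch of that pattern using only `asyncio`: `scrape` is a hypothetical stand-in that sleeps instead of driving a browser, and `scrape_all` and `max_concurrent` are illustrative names. In real use, the body of `scrape` would hold Pydoll calls like those in the example above.

```python
import asyncio


async def scrape(url, limiter):
    """Stand-in for a browser task; sleeps instead of loading a page."""
    async with limiter:
        await asyncio.sleep(0.01)  # placeholder for navigation and extraction
        return url, len(url)


async def scrape_all(urls, max_concurrent=2):
    """Run all tasks concurrently, with at most max_concurrent active at once."""
    limiter = asyncio.Semaphore(max_concurrent)
    return await asyncio.gather(*(scrape(u, limiter) for u in urls))


pages = asyncio.run(scrape_all(["https://example.com/a", "https://example.com/bb"]))
print(pages)
```

`asyncio.gather` returns results in the order the tasks were passed in, and the semaphore keeps the number of simultaneously open pages bounded, which matters when each task holds a real browser tab.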