primp vs botasaurus

|                    | primp              | botasaurus          |
| ------------------ | ------------------ | ------------------- |
| License            | MIT                | MIT                 |
| GitHub stars       | 504                | 4,321               |
| Downloads (month)  | 7.1 million        | 35.5 thousand       |
| Created            | Jun 01 2024        | Oct 01 2023         |
| Latest version     | 1.2.2 (2026-04-03) | 4.0.97 (2026-01-06) |

Primp is a Python HTTP client that impersonates real web browsers by replicating their TLS fingerprints, HTTP/2 settings, and header ordering. It is a lightweight alternative to curl-cffi for bypassing TLS and HTTP fingerprinting-based bot detection.

Key features include:

  • Browser impersonation: Impersonates Chrome, Firefox, Safari, Edge, and OkHttp clients by replicating their exact TLS fingerprints (JA3/JA4), HTTP/2 frame settings, header ordering, and other connection-level characteristics.
  • HTTP/2 support: Full HTTP/2 support with configurable settings that match real browser behavior.
  • Lightweight: Smaller and simpler than curl-cffi while providing similar impersonation capabilities. Built in Rust for performance.
  • Familiar API: Provides a requests-like API with Session support, making it easy to adopt for developers familiar with the Python requests library.
  • Proxy support: HTTP and SOCKS5 proxies with authentication.
  • Cookie management: Automatic cookie handling across requests within a session (see the sketch after this list).
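
A short sketch makes the session-level cookie behavior concrete. It assumes the Session API shown under Example Use below; the login endpoint and credentials are placeholders:

```python
import primp

# A minimal sketch of automatic cookie handling, assuming the Session API
# shown in the examples below; endpoints and credentials are placeholders.
session = primp.Session(impersonate="chrome_131")

# Suppose this login response sets a session cookie...
session.post("https://example.com/login", json={"user": "u", "password": "p"})

# ...the Session stores it and replays it automatically, so this request
# arrives authenticated with no manual Cookie header handling.
response = session.get("https://example.com/account")
print(response.status_code)
```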

Primp fills a similar niche to curl-cffi and hrequests — HTTP clients designed to avoid TLS/HTTP fingerprinting — but takes a Rust-powered approach for better performance. It is particularly useful when you need to bypass bot detection that relies on connection-level fingerprinting without using a full browser.

Botasaurus is an all-in-one Python web scraping framework that combines browser automation, anti-detection, and scaling features into a single package. It aims to simplify the entire web scraping workflow from development to deployment.

Key features include:

  • Anti-detect browser: Ships with a stealth-patched browser that passes common bot detection tests, automatically handling fingerprinting, user agent rotation, and other anti-detection measures.
  • Decorator-based API: Uses Python decorators (@browser, @request) to define scraping tasks, keeping code clean and easy to organize.
  • Built-in parallelism: Runs scraping tasks in parallel across multiple browser instances with configurable concurrency.
  • Caching: Built-in caching layer to avoid re-scraping pages during development and debugging.
  • Profile persistence: Saves and reuses browser profiles (cookies, localStorage) across scraping sessions to maintain login state (see the sketch after this list).
  • Output handling: Automatic output to JSON, CSV, or custom formats with built-in data filtering.
  • Web dashboard: Includes a web UI for monitoring scraping progress, viewing results, and managing tasks.
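
A short sketch of how profile persistence fits the decorator API. The `profile` keyword, profile name, and URL here are illustrative assumptions rather than confirmed API, following the decorator style shown under Example Use:

```python
from botasaurus.browser import browser, Driver

# A minimal sketch of profile persistence; the `profile` option, profile
# name, and URL are illustrative assumptions, not confirmed API.
@browser(profile="my-login-profile")
def open_dashboard(driver: Driver, url: str):
    driver.get(url)
    # Cookies and localStorage saved under "my-login-profile" in earlier
    # runs are reused, so a previous login can still be active here.
    return {"visited": url}

open_dashboard("https://example.com/dashboard")
```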

Botasaurus is designed for developers who want a batteries-included framework that handles anti-detection automatically, without needing to manually configure stealth settings or manage browser fingerprints.

Highlights


primp: bypass, tls-fingerprint, http-fingerprint, http2, fast
botasaurus: anti-detect, stealth, large-scale

Example Use


```python
import primp

# Create a session that impersonates Chrome
session = primp.Session(impersonate="chrome_131")

# Make requests - TLS fingerprint matches real Chrome
response = session.get("https://example.com")
print(response.status_code)
print(response.text)

# POST with JSON data
response = session.post(
    "https://api.example.com/data",
    json={"key": "value"},
)

# With proxy
session = primp.Session(
    impersonate="firefox_133",
    proxy="http://user:pass@proxy.example.com:8080",
)
response = session.get("https://example.com")

# Different browser impersonation profiles
for browser in ["chrome_131", "firefox_133", "safari_18", "edge_131"]:
    session = primp.Session(impersonate=browser)
    resp = session.get("https://tls.peet.ws/api/all")
    print(f"{browser}: {resp.json()['ja3_hash']}")
```
```python
from botasaurus.browser import browser, Driver
from botasaurus.request import request, Request

# Browser-based scraping with anti-detection
@browser(parallel=3, cache=True)
def scrape_products(driver: Driver, url: str):
    driver.get(url)

    # Wait for content to load
    driver.wait_for_element(".product-list")

    # Extract product data
    products = []
    for el in driver.select_all(".product-card"):
        products.append({
            "name": el.select(".product-name").text,
            "price": el.select(".product-price").text,
            "url": el.select("a").get_attribute("href"),
        })
    return products

# HTTP-based scraping (no browser needed)
@request(parallel=5, cache=True)
def scrape_api(req: Request, url: str):
    response = req.get(url)
    return response.json()

# Run the scraper
results = scrape_products(
    ["https://example.com/page/1", "https://example.com/page/2"]
)
```

Alternatives / Similar

curl-cffi, hrequests