Skip to content

curl-impersonatevscurl-cffi

MIT 79 1 5,944
Feb 23 2022 v0.6.1(2024-03-02 18:08:29 ago)
1,751 2 34 MIT
Feb 23 2022 594.9 thousand (month) 0.7.1(2024-07-13 09:07:25 ago)

Curl-impersonate is a special build of libcurl and cURL HTTP client that impersonates the four major browsers: - Google Chrome - Microsoft Edge - Safari - Firefox Curl-impersonate achieves this by patching TLS and HTTP fingerprints to be identical to that of one of these real browsers.

Unlike other HTTP clients curl-impersonate can bypass TSL and HTTP fingerprinting and detection techniques though it does not implement anything for Javascript fingerprint or bypass.

Curl-cffi is a Python library for implementing curl-impersonate which is a HTTP client that appears as one of popular web browsers like: - Google Chrome - Microsoft Edge - Safari - Firefox Unlike requests and httpx which are native Python libraries, curl-cffi uses cURL and inherits it's powerful features like extensive HTTP protocol support and detection patches for TLS and HTTP fingerprinting.

Using curl-cffi web scrapers can bypass TLS and HTTP fingerprinting.

Highlights


bypasshttp2tls-fingerprinthttp-fingerprintlow-level
bypasshttp2tls-fingerprinthttp-fingerprintsyncasync

Example Use


curl-impersonate installs itself under `curl_` terminal commands like `curl_chrome116`: ```shell $ curl_chrome116 https://www.wikipedia.org ``` To use it in HTTP client libraries that use `libcurl` replace curl path with one of these. To use it in python directly see curl-cffi Python package
curl-cffi can be accessed as low-level curl client as well as an easy high-level HTTP client: ```python from curl_cffi import requests response = requests.get('https://httpbin.org/json') print(response.json()) # or using sessions session = requests.Session() response = session.get('https://httpbin.org/json') # also supports async requests using asyncio import asyncio from curl_cffi.requests import AsyncSession urls = [ "http://httpbin.org/html", "http://httpbin.org/html", "http://httpbin.org/html", ] async with AsyncSession() as s: tasks = [] for url in urls: task = s.get(url) tasks.append(task) # scrape concurrently: responses = await asyncio.gather(*tasks) # also supports websocket connections from curl_cffi.requests import Session, WebSocket def on_message(ws: WebSocket, message): print(message) with Session() as s: ws = s.ws_connect( "wss://api.gemini.com/v1/marketdata/BTCUSD", on_message=on_message, ) ws.run_forever() ```

Alternatives / Similar


Was this page helpful?