curl-impersonate vs curl-cffi

MIT 68 1 4,221

Feb 23 2022 v0.6.1(1 year, 5 months ago)

1,751 2 34 MIT

Feb 23 2022 594.9 thousand (month) 0.7.1(1 year, 1 month ago)

Curl-impersonate is a special build of libcurl and the cURL HTTP client that impersonates the four major browsers:

- Google Chrome
- Microsoft Edge
- Safari
- Firefox

Curl-impersonate achieves this by patching its TLS and HTTP fingerprints to match those of one of these real browsers.

Unlike other HTTP clients, curl-impersonate can bypass TLS and HTTP fingerprinting and the detection techniques built on them. It does not run JavaScript, so it does nothing to bypass JavaScript-based fingerprinting.
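To make "TLS fingerprint" concrete, here is a minimal sketch of JA3, one widely used fingerprint format. The source does not say which fingerprint a given detector uses, so treat JA3 as an illustrative assumption. The ClientHello values below are made up for the example and are not taken from a real browser.

```python
import hashlib


def ja3_string(tls_version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string: five comma-separated fields, each a
    dash-separated list of decimal values taken from the ClientHello."""
    fields = [
        str(tls_version),
        "-".join(str(v) for v in ciphers),
        "-".join(str(v) for v in extensions),
        "-".join(str(v) for v in curves),
        "-".join(str(v) for v in point_formats),
    ]
    return ",".join(fields)


def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """The JA3 fingerprint is the MD5 hex digest of the JA3 string."""
    raw = ja3_string(tls_version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Hypothetical ClientHello values (illustrative only).
client_a = dict(
    tls_version=771,
    ciphers=[4865, 4866, 4867],
    extensions=[0, 23, 65281, 10, 11],
    curves=[29, 23, 24],
    point_formats=[0],
)
# Same client, but offering its ciphers in a different order.
client_b = dict(client_a, ciphers=[4867, 4866, 4865])

print(ja3_string(**client_a))
print(ja3_hash(**client_a))
print(ja3_hash(**client_b))
```

Because the order of ciphers and extensions is part of the hashed string, a client that offers the same features in a different order gets a different fingerprint. This is why an impersonating client has to reproduce the browser's handshake exactly, not just its User-Agent header.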

Curl-cffi is a Python binding for curl-impersonate, an HTTP client that appears as one of the popular web browsers:

- Google Chrome
- Microsoft Edge
- Safari
- Firefox

Unlike requests and httpx, which are native Python libraries, curl-cffi uses cURL and inherits its powerful features, such as extensive HTTP protocol support, along with curl-impersonate's patches against TLS and HTTP fingerprint detection.

Using curl-cffi, web scrapers can bypass TLS and HTTP fingerprinting.
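As a minimal sketch of how a scraper would ask for a browser fingerprint, curl-cffi's requests-style functions accept an `impersonate` argument. The helper name `fetch_as_browser` is hypothetical, and the target `"chrome110"` is an assumption: the set of supported targets depends on the installed curl-cffi version.

```python
def fetch_as_browser(url, browser="chrome110", timeout=30):
    """Fetch `url` while presenting the TLS/HTTP fingerprint of `browser`.

    `browser` must be an impersonation target supported by the installed
    curl-cffi version; "chrome110" is only an assumed default.
    """
    # Imported inside the function so the helper can be defined even on
    # machines where curl-cffi is not installed yet.
    from curl_cffi import requests

    return requests.get(url, impersonate=browser, timeout=timeout)
```

Calling `fetch_as_browser("https://httpbin.org/json").json()` should then behave like the plain `requests.get` call, except that the handshake is shaped like the chosen browser's.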

Highlights

bypass, http2, tls-fingerprint, http-fingerprint, low-level

bypass, http2, tls-fingerprint, http-fingerprint, sync, async

Example Use

curl-impersonate installs wrapper commands prefixed with `curl_`, one per browser version, such as `curl_chrome116`:

$ curl_chrome116 https://www.wikipedia.org

To use it in HTTP client libraries that are built on `libcurl`, replace their curl path with one of these. To use it directly in Python, see the curl-cffi Python package.
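A different route from swapping the libcurl path is to shell out to the wrapper command from a script. The sketch below assumes `curl_chrome116` is on `PATH`. The helper names `build_command` and `fetch` are hypothetical, and `--silent` and `--location` are standard curl flags, which the wrapper is assumed to pass through to curl.

```python
import shutil
import subprocess


def build_command(url, binary="curl_chrome116"):
    """Return the argument list for one impersonated GET request."""
    return [binary, "--silent", "--location", url]


def fetch(url, binary="curl_chrome116", timeout=30):
    """Run the curl-impersonate wrapper and return the response body."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    result = subprocess.run(
        build_command(url, binary),
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode output to str
        check=True,           # raise CalledProcessError on non-zero exit
        timeout=timeout,
    )
    return result.stdout
```

For example, `fetch("https://www.wikipedia.org")` would return the page HTML as a string. Passing the arguments as a list avoids shell quoting issues with URLs that contain `&` or `?`.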

curl-cffi can be used as a low-level curl client or as an easy high-level HTTP client with a requests-like API:

from curl_cffi import requests

response = requests.get('https://httpbin.org/json')
print(response.json())

# or using sessions
session = requests.Session()
response = session.get('https://httpbin.org/json')

# also supports async requests using asyncio
import asyncio
from curl_cffi.requests import AsyncSession

urls = [
  "http://httpbin.org/html",
  "http://httpbin.org/html",
  "http://httpbin.org/html",
]

async def main():
    # `async with` is only valid inside a coroutine
    async with AsyncSession() as s:
        # create one request per URL, then scrape concurrently:
        tasks = [s.get(url) for url in urls]
        return await asyncio.gather(*tasks)

responses = asyncio.run(main())

# also supports websocket connections
from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()

Alternatives / Similar

- curl-cffi (1,751)
- hrequests (780)
- requests (52,519)
- node-fetch (8,825)
- axios (106,345)
- aiohttp (15,425)
- httpx (13,703)
- got (14,454)
- superagent (16,610)
- needle (1,637)
- faraday (5,785)
- httpclient (703)
- undetected-chromedriver (10,683)
- excon (1,163)
- httparty (5,837)
- pycurl (1,094)
- typhoeus (4,084)
- puppeteer-stealth (89,751)
- httr (988)
- rvest (1,498)
- guzzle (23,055)
- em-http-request (1,217)
- symfony-http (1,976)
- wreck (381)
- http-2 (898)
- treq (590)
- resty (10,341)
- req (4,374)
- nestful (505)
- crul (107)
- requests (3,576)
- selenium-driverless (718)
- buzz (1,913)
- httpful (1,741)
- ralger (156)
- http.rb (3,013)
- curl-impersonate (4,221)
