axios vs curl-cffi

axios: MIT license · 105,935 · 15 · 676 · 240.7 million downloads (month) · Aug 29 2014 · version 1.7.9 (6 days ago)
curl-cffi: MIT license · 1,751 · 2 · 34 · 594.9 thousand downloads (month) · Feb 23 2022 · version 0.7.1 (4 months ago)

axios is a popular promise-based JavaScript library for making HTTP requests that works in both the browser and Node.js. It is similar to the Fetch API, but with a more powerful feature set and broader browser compatibility.

One of the main benefits of using axios is that it automatically parses JSON response bodies into JavaScript objects, making the data easy to work with.

Axios is known for its user-friendly API and its support for async/await syntax, which makes it very accessible for web scraping.

curl-cffi is a Python binding for curl-impersonate, an HTTP client that can present itself as one of the popular web browsers:

- Google Chrome
- Microsoft Edge
- Safari
- Firefox

Unlike requests and httpx, which are native Python libraries, curl-cffi is built on cURL and inherits its powerful features, such as extensive HTTP protocol support, together with patches that make its TLS and HTTP fingerprints resemble those of real browsers.

Using curl-cffi, web scrapers can bypass TLS and HTTP fingerprinting.
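
As a rough sketch of what this looks like in practice, curl-cffi's requests-style functions accept an `impersonate` argument naming the browser to mimic. The helper name `fetch_as_browser`, the default target `"chrome"`, and the 30-second timeout are assumptions chosen for illustration; which target names are available depends on the installed curl-cffi version.

```python
def fetch_as_browser(url, browser="chrome"):
    """Fetch `url` while presenting the TLS/HTTP fingerprint of `browser`.

    Hypothetical helper for illustration; not part of curl-cffi itself.
    """
    # Imported lazily so the helper can be defined even where
    # curl-cffi is not installed yet.
    from curl_cffi import requests

    # `impersonate` selects the browser profile whose fingerprint is mimicked.
    # The "chrome" target name is an assumption; older releases may need a
    # versioned name such as "chrome110".
    return requests.get(url, impersonate=browser, timeout=30)
```

Calling `fetch_as_browser("https://httpbin.org/headers")` and printing `response.json()` should then show the headers the server received for the impersonated request.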

Highlights


bypass, http2, tls-fingerprint, http-fingerprint, sync, async

Example Use


import axios from 'axios';

// axios can be used with promises:
axios.get('http://httpbin.org/json')
  .then(response => {
    console.log(response.data);
  })
  .catch(error => {
    console.log(error);
  });

// or with async/await syntax:
const resp = await axios.get('http://httpbin.org/json');
console.log(resp.data);

// to make requests concurrently, Promise.all can be used:
const results = await Promise.all([
  axios.get('http://httpbin.org/html'),
  axios.get('http://httpbin.org/html'),
  axios.get('http://httpbin.org/html'),
])

// axios also supports other request types like POST and automatically serializes the body to JSON:
await axios.post('http://httpbin.org/post', {'query': 'hello world'});
// or send form data:
const data = {name: 'John Doe', email: 'johndoe@example.com'};

await axios.post('https://jsonplaceholder.typicode.com/users',
    new URLSearchParams(data).toString(), // URL-encode the form fields
    {
        headers: {
            'Content-Type': 'application/x-www-form-urlencoded'
        }
    }
);

// default values like headers can be configured globally:
axios.defaults.headers.common['User-Agent'] = 'webscraping.fyi';
// or per instance, which then acts like a session:
const instance = axios.create({
  headers: {"User-Agent": "webscraping.fyi"},
});

curl-cffi can be used as a low-level curl client as well as an easy-to-use, high-level HTTP client:

from curl_cffi import requests

response = requests.get('https://httpbin.org/json')
print(response.json())

# or using sessions
session = requests.Session()
response = session.get('https://httpbin.org/json')

# also supports async requests using asyncio
import asyncio
from curl_cffi.requests import AsyncSession

urls = [
  "http://httpbin.org/html",
  "http://httpbin.org/html",
  "http://httpbin.org/html",
]

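async def main():
    async with AsyncSession() as s:
        # schedule one request per URL
        tasks = [s.get(url) for url in urls]
        # scrape concurrently:
        return await asyncio.gather(*tasks)

# `async with` and `await` must run inside a coroutine
responses = asyncio.run(main())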

# also supports websocket connections
from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()
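
The low-level curl client mentioned above is not shown in the examples, so here is a minimal sketch of it based on the `Curl` and `CurlOpt` objects that curl-cffi exports. The function name `fetch_low_level` and the `"chrome110"` target are assumptions for illustration, and the exact option names may differ between curl-cffi versions.

```python
from io import BytesIO


def fetch_low_level(url, browser="chrome110"):
    """Fetch `url` with the low-level curl handle and return the body as bytes.

    Hypothetical helper for illustration; not part of curl-cffi itself.
    """
    # Imported lazily so the helper can be defined even where
    # curl-cffi is not installed yet.
    from curl_cffi import Curl, CurlOpt

    buffer = BytesIO()
    curl = Curl()
    try:
        # Options mirror libcurl's setopt interface; the URL is passed as bytes.
        curl.setopt(CurlOpt.URL, url.encode())
        # The response body is written into the in-memory buffer.
        curl.setopt(CurlOpt.WRITEDATA, buffer)
        # Apply the browser fingerprint before performing the request.
        curl.impersonate(browser)
        curl.perform()
    finally:
        # Always release the underlying curl handle.
        curl.close()
    return buffer.getvalue()
```

The high-level `requests`-style API is usually the more convenient choice; the low-level handle is mainly useful when an option has to be set directly on curl.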
