needle vs curl-cffi

needle: MIT | 86 | 4 | 1,624 | 32.1 million (month) | Dec 11 2011 | 3.3.1 (10 months ago)
curl-cffi: MIT | 34 | 2 | 1,751 | 594.9 thousand (month) | Feb 23 2022 | 0.7.1 (3 months ago)

needle is an HTTP client library for Node.js that provides a simple, flexible, and powerful API for making HTTP requests. It supports all major HTTP methods and has a clean and easy-to-use interface for handling responses and errors.

curl-cffi is a Python binding for curl-impersonate, an HTTP client that can present itself as one of several popular web browsers, such as Google Chrome, Microsoft Edge, Safari, and Firefox. Unlike requests and httpx, which are native Python libraries, curl-cffi uses cURL and inherits its powerful features, such as extensive HTTP protocol support, along with patches that counter TLS and HTTP fingerprint detection.
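To make "TLS fingerprinting" more concrete, here is a minimal sketch of how one publicly documented scheme, JA3, condenses the fields of a TLS ClientHello into a short hash. The function names are illustrative and the field values are hypothetical (not captured from any real browser). The sketch also skips details such as filtering GREASE values, and nothing here implies that a given site uses this particular scheme.

```python
import hashlib
from typing import Sequence


def ja3_string(
    version: int,
    ciphers: Sequence[int],
    extensions: Sequence[int],
    curves: Sequence[int],
    point_formats: Sequence[int],
) -> str:
    """Build a JA3-style string: five comma-separated fields,
    each a dash-separated list of decimal values in the order the client sent them."""
    fields = [
        str(version),
        "-".join(str(v) for v in ciphers),
        "-".join(str(v) for v in extensions),
        "-".join(str(v) for v in curves),
        "-".join(str(v) for v in point_formats),
    ]
    return ",".join(fields)


def ja3_hash(ja3: str) -> str:
    """Return the MD5 hex digest of a JA3-style string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Hypothetical ClientHello values, for illustration only.
example = ja3_string(
    version=771,                 # 0x0303, the TLS 1.2 version number
    ciphers=[4865, 4866, 4867],
    extensions=[0, 10, 11],
    curves=[29, 23, 24],
    point_formats=[0],
)
print(example)
print(ja3_hash(example))
```

Because the order of ciphers and extensions feeds into the hash, a client whose TLS stack orders them differently from a real browser produces a different fingerprint, even if its User-Agent header claims to be that browser.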

Using curl-cffi, web scrapers can bypass blocking that relies on TLS and HTTP fingerprinting.
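As a rough sketch of what this looks like in practice, curl-cffi's requests-style API accepts an `impersonate` argument naming the browser to imitate. The helper names below (`build_request_kwargs`, `fetch`) are hypothetical, and the target name `"chrome110"` is assumed to be supported by the installed curl-cffi version. Check the project's documentation for the targets your version provides.

```python
from typing import Any, Dict

# Assumed to be a valid impersonation target in the installed curl-cffi version.
DEFAULT_BROWSER = "chrome110"


def build_request_kwargs(browser: str = DEFAULT_BROWSER, timeout: float = 30.0) -> Dict[str, Any]:
    """Collect keyword arguments for a curl-cffi request that imitates a browser."""
    return {"impersonate": browser, "timeout": timeout}


def fetch(url: str, browser: str = DEFAULT_BROWSER):
    """Send a GET request whose TLS and HTTP/2 fingerprints imitate the given browser."""
    # Imported inside the function so the helper above works without curl-cffi installed.
    from curl_cffi import requests

    return requests.get(url, **build_request_kwargs(browser))
```

Calling `fetch("https://httpbin.org/headers")` and printing `response.text` is one way to inspect the headers the server actually received.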

Highlights


bypass, http2, tls-fingerprint, http-fingerprint, sync, async

Example Use


const needle = require('needle');

// callback style: needle.get(url, callback)
needle.get('https://httpbin.org/get', (err, res) => {
    if (err) {
        console.error(err);
        return;
    }
    console.log(res.body);
});

// promise style: needle(method, url, ...) returns a Promise when no callback is given
// (needle.get() without a callback returns a stream, so it cannot be awaited)
async function main() {
    const response = await needle('get', 'https://httpbin.org/get');
    console.log(response.statusCode);

    // concurrent requests can be sent using Promise.all
    const results = await Promise.all([
        needle('get', 'http://httpbin.org/html'),
        needle('get', 'http://httpbin.org/html'),
        needle('get', 'http://httpbin.org/html'),
    ]);

    // POST requests (data is form-encoded by default; pass { json: true } in options to send JSON)
    const data = { name: 'John Doe' };
    await needle('post', 'https://api.example.com', data);

    // proxy
    const proxyOptions = {
        proxy: 'http://proxy.example.com:8080'
    };
    await needle('get', 'https://httpbin.org/ip', null, proxyOptions);

    // headers and cookies
    const headerOptions = {
        headers: {
            'Cookie': 'myCookie=123',
            'X-My-Header': 'myValue'
        }
    };
    await needle('get', 'https://httpbin.org/headers', null, headerOptions);
}

main().catch(console.error);

curl-cffi can be used as a low-level curl client as well as an easy-to-use, high-level HTTP client:

from curl_cffi import requests

response = requests.get('https://httpbin.org/json')
print(response.json())

# or using sessions
session = requests.Session()
response = session.get('https://httpbin.org/json')

# also supports async requests using asyncio
import asyncio
from curl_cffi.requests import AsyncSession

urls = [
  "http://httpbin.org/html",
  "http://httpbin.org/html",
  "http://httpbin.org/html",
]

# "async with" and "await" are only valid inside a coroutine
async def main():
    async with AsyncSession() as s:
        tasks = []
        for url in urls:
            task = s.get(url)
            tasks.append(task)
        # scrape concurrently:
        return await asyncio.gather(*tasks)

responses = asyncio.run(main())

# also supports websocket connections
from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()
