Httpful vs Curl-cffi
Httpful is a simple HTTP client library for PHP 7.2+. There is an emphasis on readability, simplicity, and flexibility: provide the features and flexibility needed to get the job done, and make those features easy to use.
Features
- Readable HTTP Method Support (GET, PUT, POST, DELETE, HEAD, PATCH and OPTIONS)
- Custom Headers
- Automatic "Smart" Parsing
- Automatic Payload Serialization
- Basic Auth
- Client Side Certificate Auth
- Request "Templates"
Curl-cffi is a Python binding for curl-impersonate, an HTTP client that can impersonate popular web browsers such as:
- Google Chrome
- Microsoft Edge
- Safari
- Firefox
Unlike requests and httpx, which are native Python libraries, curl-cffi uses cURL and inherits its powerful features, such as extensive HTTP protocol support and detection patches for TLS and HTTP fingerprinting. Using curl-cffi, web scrapers can bypass TLS and HTTP fingerprinting.
Highlights
bypass, http2, tls-fingerprint, http-fingerprint, sync, async
Example Use
require 'vendor/autoload.php';

use Httpful\Request;

// make GET request
$response = Request::get("http://httpbin.org/get")
    ->send();
echo $response->body;

// make POST request with a JSON body
$data = array('name' => 'Bob', 'age' => 35);
$response = Request::post("http://httpbin.org/post")
    ->sendsJson()
    ->body(json_encode($data))
    ->send();
echo $response->body;

// add headers or cookies
$response = Request::get("http://httpbin.org/headers")
    ->addHeader("API-KEY", "mykey")
    ->addHeader("Cookie", "foo=bar")
    ->send();
echo $response->body;
curl-cffi can be used as a low-level cURL client as well as an easy high-level HTTP client:
from curl_cffi import requests
response = requests.get('https://httpbin.org/json')
print(response.json())
# or using sessions
session = requests.Session()
response = session.get('https://httpbin.org/json')
# also supports async requests using asyncio
import asyncio
from curl_cffi.requests import AsyncSession

urls = [
    "http://httpbin.org/html",
    "http://httpbin.org/html",
    "http://httpbin.org/html",
]

async def main():
    async with AsyncSession() as s:
        tasks = []
        for url in urls:
            task = s.get(url)
            tasks.append(task)
        # scrape concurrently:
        responses = await asyncio.gather(*tasks)

asyncio.run(main())
# also supports websocket connections
from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()