Requests vs Curl-cffi
PHP library "Requests" is an HTTP library written in PHP, for making HTTP requests. It's heavily inspired by a popular Python library called Requests and aims for the same goals of simplifying HTTP client complexities.
It abstracts the complexities of making requests behind a simple API so that you can focus on interacting with services and consuming data in your application.
Requests allows you to send HTTP/1.1 HEAD, GET, POST, PUT, DELETE, and PATCH requests. You can add headers, form data, multipart files, and parameters with basic arrays, and access the response data in the same way.
Requests uses cURL or fsockopen, depending on what your system has available, while abstracting all the nasty stuff out of your way and providing a consistent API. A short sketch of these options in action follows the feature list.
Features:
- International Domains and URLs
- Browser-style SSL Verification
- Basic/Digest Authentication
- Automatic Decompression
- Connection Timeouts
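As a minimal sketch of how these features surface in the API (assuming the Requests v1 style used in the example below; the header values, credentials, and timeout are illustrative placeholders):

<?php
require 'vendor/autoload.php';

// pass request headers as an associative array and tune
// behavior through the options array ('auth' and 'timeout'
// are documented Requests options; the values are placeholders)
$headers = array('Accept' => 'application/json');
$options = array(
    'auth'    => array('user', 'pass'), // Basic authentication
    'timeout' => 10,                    // connection timeout in seconds
);
$response = Requests::get('https://httpbin.org/get', $headers, $options);
echo $response->status_code;
echo $response->body;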
Curl-cffi is a Python binding for curl-impersonate, an HTTP client that impersonates popular web browsers such as:
- Google Chrome
- Microsoft Edge
- Safari
- Firefox
Unlike requests and httpx, which are native Python libraries, curl-cffi builds on cURL and inherits its powerful features, like extensive HTTP protocol support and detection patches for TLS and HTTP fingerprinting. Using curl-cffi, web scrapers can bypass TLS and HTTP fingerprinting.
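As a minimal sketch (httpbin is used here only as a stand-in target; "chrome" is one of curl-cffi's supported impersonation profiles):

from curl_cffi import requests

# send the request with Chrome's TLS and HTTP fingerprint so that
# fingerprint-based blocking sees a real browser profile
response = requests.get('https://httpbin.org/get', impersonate='chrome')
print(response.status_code)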
Example Use
<?php
require 'vendor/autoload.php';

// make a GET request
$response = Requests::get('https://httpbin.org/get');
echo $response->status_code;

// make a POST request with form data and Basic authentication
$data = array('name' => 'Bob', 'age' => 35);
$options = array('auth' => array('user', 'pass'));
// the second argument is the (empty) headers array
$response = Requests::post('https://httpbin.org/post', array(), $data, $options);
echo $response->status_code;
from curl_cffi import requests

# make a GET request
response = requests.get('https://httpbin.org/json')
print(response.json())

# or use a session to reuse connections and persist cookies
session = requests.Session()
response = session.get('https://httpbin.org/json')
print(response.json())
# curl-cffi also supports async requests using asyncio
import asyncio
from curl_cffi.requests import AsyncSession

urls = [
    "http://httpbin.org/html",
    "http://httpbin.org/html",
    "http://httpbin.org/html",
]

async def main():
    async with AsyncSession() as s:
        # schedule each request without awaiting it yet
        tasks = []
        for url in urls:
            task = s.get(url)
            tasks.append(task)
        # scrape concurrently:
        responses = await asyncio.gather(*tasks)
        print([response.status_code for response in responses])

asyncio.run(main())
# curl-cffi also supports websocket connections
from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()