Primp is a Python HTTP client that impersonates real web browsers by replicating their
TLS fingerprints, HTTP/2 settings, and header ordering. It is a lightweight alternative
to curl-cffi for bypassing TLS and HTTP fingerprinting-based bot detection.

Key features include:

- Browser impersonation: can impersonate Chrome, Firefox, Safari, Edge, and OkHttp clients by
  replicating their exact TLS fingerprints (JA3/JA4), HTTP/2 frame settings, header ordering,
  and other connection-level characteristics.
- HTTP/2 support: full HTTP/2 support with configurable settings that match real browser behavior.
- Lightweight: smaller and simpler than curl-cffi while providing similar impersonation
  capabilities. Built on Rust for performance.
- Familiar API: provides a requests-like API with Session support, making it easy to adopt for
  developers familiar with the Python requests library.
- Proxy support: HTTP and SOCKS5 proxy support with authentication.
- Cookie management: automatic cookie handling across requests within a session.

Primp fills a similar niche to curl-cffi and hrequests, both HTTP clients designed to evade
TLS/HTTP fingerprinting, but builds on a Rust core for performance. It is particularly useful
when you need to get past bot detection that relies on connection-level fingerprinting
without running a full browser.
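
To make the JA3 fingerprint mentioned above more concrete, here is a minimal sketch of how a JA3 hash is derived from the fields of a TLS ClientHello. It follows the published JA3 format (five comma-separated fields, values within a field joined by hyphens, then MD5). The helper names `ja3_string` and `ja3_hash` are hypothetical, and the numeric values are illustrative rather than captured from a real browser. Real implementations also drop GREASE values before hashing, which this sketch does not do.

```python
import hashlib


def ja3_string(tls_version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string: five fields joined by ',', values joined by '-'."""
    fields = [
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return ",".join(fields)


def ja3_hash(ja3):
    """The JA3 hash is the MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Illustrative values only (assumption): two ClientHellos that offer the same
# cipher suites, but in a different order.
hello_a = ja3_string(771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
hello_b = ja3_string(771, [4866, 4865, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])

print(hello_a)  # 771,4865-4866-4867,0-23-65281-10-11,29-23-24,0
print(ja3_hash(hello_a) == ja3_hash(hello_b))  # False: ordering alone changes the hash
```

Because the hash depends on the order of the offered values as well as the values themselves, a client that offers the same cipher suites in a different order produces a different fingerprint. That is why an impersonating client has to reproduce the browser's handshake exactly and not just advertise similar capabilities.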

ralger is a small web scraping framework for R, built on rvest and xml2.
Its goal is to simplify basic web scraping through a convenient, easy-to-use API.
It offers functions for retrieving pages, parsing HTML with CSS selectors, parsing tables
automatically, and extracting links, titles, images, and paragraphs.

```python
import primp

# Create a session that impersonates Chrome
session = primp.Session(impersonate="chrome_131")

# Make requests - TLS fingerprint matches real Chrome
response = session.get("https://example.com")
print(response.status_code)
print(response.text)

# POST with JSON data
response = session.post(
    "https://api.example.com/data",
    json={"key": "value"},
)

# With proxy
session = primp.Session(
    impersonate="firefox_133",
    proxy="http://user:pass@proxy.example.com:8080",
)
response = session.get("https://example.com")

# Different browser impersonation profiles
for browser in ["chrome_131", "firefox_133", "safari_18", "edge_131"]:
    session = primp.Session(impersonate=browser)
    resp = session.get("https://tls.peet.ws/api/all")
    # The JA3 hash is nested under the "tls" key of the JSON response
    print(f"{browser}: {resp.json()['tls']['ja3_hash']}")
```
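
The proxy URL in the example above embeds the username and password directly. When credentials contain reserved characters such as `@` or `:`, they generally need to be percent-encoded before being placed in a URL. The sketch below shows one way to build such a URL with the standard library. The helper name `build_proxy_url` is hypothetical and not part of primp, and it assumes the client decodes percent-encoded credentials as URL conventions prescribe.

```python
from urllib.parse import quote


def build_proxy_url(scheme, host, port, username=None, password=None):
    """Return a proxy URL with percent-encoded credentials (hypothetical helper)."""
    # Only the proxy types named in the feature list are accepted here.
    if scheme not in {"http", "socks5"}:
        raise ValueError(f"unsupported proxy scheme: {scheme}")
    auth = ""
    if username is not None:
        # safe="" forces encoding of every reserved character, including '/'
        auth = quote(username, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    return f"{scheme}://{auth}{host}:{port}"


print(build_proxy_url("http", "proxy.example.com", 8080, "user", "p@ss:word"))
# http://user:p%40ss%3Aword@proxy.example.com:8080
print(build_proxy_url("socks5", "proxy.example.com", 1080))
# socks5://proxy.example.com:1080
```

The resulting string can be passed wherever the example above passes a literal proxy URL.
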
```r
library("ralger")
url <- "http://www.shanghairanking.com/rankings/arwu/2021"
# retrieve HTML and select elements using CSS selectors:
best_uni <- scrap(link = url, node = "a span", clean = TRUE)
head(best_uni, 5)
#> [1] "Harvard University"
#> [2] "Stanford University"
#> [3] "University of Cambridge"
#> [4] "Massachusetts Institute of Technology (MIT)"
#> [5] "University of California, Berkeley"

# ralger can also parse HTML attributes
attributes <- attribute_scrap(
  link = "https://ropensci.org/",
  node = "a",     # the a tag
  attr = "class"  # getting the class attribute
)
head(attributes, 10) # NA values are a tags without a class attribute
#>  [1] "navbar-brand logo" "nav-link"          NA
#>  [4] NA                  NA                  "nav-link"
#>  [7] NA                  "nav-link"          NA
#> [10] NA

# ralger can automatically scrape tables:
data <- table_scrap(link = "https://www.boxofficemojo.com/chart/top_lifetime_gross/?area=XWW")
head(data)
#> # A tibble: 6 × 4
#>    Rank Title                                      `Lifetime Gross`  Year
#>
#> 1     1 Avatar                                     $2,847,397,339    2009
#> 2     2 Avengers: Endgame                          $2,797,501,328    2019
#> 3     3 Titanic                                    $2,201,647,264    1997
#> 4     4 Star Wars: Episode VII - The Force Awakens $2,069,521,700    2015
#> 5     5 Avengers: Infinity War                     $2,048,359,754    2018
#> 6     6 Spider-Man: No Way Home                    $1,901,216,740    2021
```