Skip to content

primpvsralger

MIT 3 1 504
7.1 million (month) Jun 01 2024 1.2.2(2026-04-03 07:11:15 ago)
165 1 3 MIT
Dec 22 2019 327 (month) 2.3.0(2021-03-18 00:10:00 ago)

Primp is a Python HTTP client that impersonates real web browsers by replicating their TLS fingerprints, HTTP/2 settings, and header ordering. It is a lightweight alternative to curl-cffi for bypassing TLS and HTTP fingerprinting-based bot detection.

Key features include:

  • Browser impersonation Can impersonate Chrome, Firefox, Safari, Edge, and OkHttp clients by replicating their exact TLS fingerprints (JA3/JA4), HTTP/2 frame settings, header ordering, and other connection-level characteristics.
  • HTTP/2 support Full HTTP/2 support with configurable settings that match real browser behavior.
  • Lightweight Smaller and simpler than curl-cffi while providing similar impersonation capabilities. Built on Rust for performance.
  • Familiar API Provides a requests-like API with Session support, making it easy to adopt for developers familiar with the Python requests library.
  • Proxy support HTTP and SOCKS5 proxy support with authentication.
  • Cookie management Automatic cookie handling across requests within a session.

Primp fills a similar niche to curl-cffi and hrequests — HTTP clients designed to avoid TLS/HTTP fingerprinting — but takes a Rust-powered approach for better performance. It is particularly useful when you need to bypass bot detection that relies on connection-level fingerprinting without using a full browser.

ralger is a small web scraping framework for R based on rvest and xml2.

It's goal to simplify basic web scraping and it provides a convenient and easy to use API.

It offers functions for retrieving pages, parsing HTML using CSS selectors, automatic table parsing and auto link, title, image and paragraph extraction.

Highlights


bypasstls-fingerprinthttp-fingerprinthttp2fast

Example Use


```python import primp # Create a session that impersonates Chrome session = primp.Session(impersonate="chrome_131") # Make requests - TLS fingerprint matches real Chrome response = session.get("https://example.com") print(response.status_code) print(response.text) # POST with JSON data response = session.post( "https://api.example.com/data", json={"key": "value"}, ) # With proxy session = primp.Session( impersonate="firefox_133", proxy="http://user:pass@proxy.example.com:8080", ) response = session.get("https://example.com") # Different browser impersonation profiles for browser in ["chrome_131", "firefox_133", "safari_18", "edge_131"]: session = primp.Session(impersonate=browser) resp = session.get("https://tls.peet.ws/api/all") print(f"{browser}: {resp.json()['ja3_hash']}") ```
```r library("ralger") url <- "http://www.shanghairanking.com/rankings/arwu/2021" # retrieve HTML and select elements using CSS selectors: best_uni <- scrap(link = url, node = "a span", clean = TRUE) head(best_uni, 5) #> [1] "Harvard University" #> [2] "Stanford University" #> [3] "University of Cambridge" #> [4] "Massachusetts Institute of Technology (MIT)" #> [5] "University of California, Berkeley" # ralger can also parse HTML attributes attributes <- attribute_scrap( link = "https://ropensci.org/", node = "a", # the a tag attr = "class" # getting the class attribute ) head(attributes, 10) # NA values are a tags without a class attribute #> [1] "navbar-brand logo" "nav-link" NA #> [4] NA NA "nav-link" #> [7] NA "nav-link" NA #> [10] NA # # ralger can automatically scrape tables: data <- table_scrap(link ="https://www.boxofficemojo.com/chart/top_lifetime_gross/?area=XWW") head(data) #> # A tibble: 6 × 4 #> Rank Title `Lifetime Gross` Year #> #> 1 1 Avatar $2,847,397,339 2009 #> 2 2 Avengers: Endgame $2,797,501,328 2019 #> 3 3 Titanic $2,201,647,264 1997 #> 4 4 Star Wars: Episode VII - The Force Awakens $2,069,521,700 2015 #> 5 5 Avengers: Infinity War $2,048,359,754 2018 #> 6 6 Spider-Man: No Way Home $1,901,216,740 2021 ```

Alternatives / Similar


Was this page helpful?