Skip to content

hrequestsvshttr

MIT 51 1 1,001
33.3 thousand (month) Feb 23 2022 0.9.2(2024-12-01 02:55:27 ago)
985 9 10 MIT
May 06 2012 1.2 million (month) 1.4.8(2023-08-15 11:00:00 ago)

hrequests is a feature rich modern replacement for a famous requests library for Python. It provides a feature rich HTTP client capable of resisting popular scraper identification techniques: - Seamless transition between headless browser and http client based requests - Integrated HTML parser - Mimicking of real browser TLS fingerprints - Javascript rendering - HTTP2 support - Realistic browser headers

The aim of httr is to provide a wrapper for the curl package, customised to the demands of modern web APIs.

Key features:

  • Functions for the most important http verbs: GET(), HEAD(), PATCH(), PUT(), DELETE() and POST().
  • Automatic connection sharing across requests to the same website (by default, curl handles are managed automatically), cookies are maintained across requests, and a up-to-date root-level SSL certificate store is used.
  • Requests return a standard reponse object that captures the http status line, headers and body, along with other useful information.
  • Response content is available with content() as a raw vector (as = "raw"), a character vector (as = "text"), or parsed into an R object (as = "parsed"), currently for html, xml, json, png and jpeg.
  • You can convert http errors into R errors with stop_for_status().
  • Config functions make it easier to modify the request in common ways: set_cookies(), add_headers(), authenticate(), use_proxy(), verbose(), timeout(), content_type(), accept(), progress().
  • Support for OAuth 1.0 and 2.0 with oauth1.0_token() and oauth2.0_token(). The demo directory has eight OAuth demos: four for 1.0 (twitter, vimeo, withings and yahoo) and four for 2.0 (facebook, github, google, linkedin). OAuth credentials are automatically cached within a project.

Highlights


bypasshttp2tls-fingerprinthttp-fingerprintsyncasync

Example Use


hrequests has almost identical API and UX as requests and here's a quick overview: ```python import hrequests # perform HTTP client requests resp = hrequests.get('https://httpbin.org/html') print(resp.status_code) # 200 # use headless browsers and sessions: session = hrequests.Session('chrome', version=122, os="mac") # supports asyncio and easy concurrency requests = [ hrequests.async_get('https://www.google.com/', browser='firefox'), hrequests.async_get('https://www.duckduckgo.com/'), hrequests.async_get('https://www.yahoo.com/'), hrequests.async_get('https://www.httpbin.org/'), ] responses = hrequests.map(requests, size=3) # max 3 conccurency ```
```r library(httr) # GET requests: resp <- GET("http://httpbin.org/get") status_code(resp) # status code headers(resp) # headers str(content(resp)) # body # POST requests: # Form encoded resp <- POST(url, body = body, encode = "form") # Multipart encoded resp <- POST(url, body = body, encode = "multipart") # JSON encoded resp <- POST(url, body = body, encode = "json") # setting cookies: resp <- GET("http://httpbin.org/cookies", set_cookies("MeWant" = "cookies")) content(r)$cookies # get response cookies ```

Alternatives / Similar


Was this page helpful?