selenium-driverless vs hrequests

NOASSERTION 14 1 718
6.5 thousand (month) Jul 22 2022 1.9.4(2024-10-22 01:41:19 ago)
1,001 1 51 MIT
Feb 23 2022 33.3 thousand (month) 0.9.2(2024-12-01 02:55:27 ago)

Selenium Driverless is a Selenium-inspired browser automation library with a focus on bypassing web scraping detection. It shares most of Selenium's API and UX but adds several extensions that make the scraper more difficult to detect, plus extra usability features such as:

- Bypass Cloudflare
- Multiple tab scraping
- Multiple context support
- Proxy auth
- Network interception

hrequests is a feature-rich, modern replacement for the popular requests library for Python. It provides an HTTP client capable of resisting common scraper identification techniques:

- Seamless transition between headless browser and HTTP client based requests
- Integrated HTML parser
- Mimicking of real browser TLS fingerprints
- JavaScript rendering
- HTTP/2 support
- Realistic browser headers

Highlights


bypass, http2, tls-fingerprint, http-fingerprint, sync, async

Example Use


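```python
# Note: this snippet imports undetected_chromedriver, which is a separate
# package from selenium-driverless. It works the same as Selenium, just with
# a different import.
import undetected_chromedriver as uc

# Start a headless Chrome instance.
driver = uc.Chrome(headless=True, use_subprocess=False)
driver.get('https://nowsecure.nl')
driver.save_screenshot('screenshot.png')
# quit() closes the browser and ends the driver session.
driver.quit()
```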
hrequests has an almost identical API and UX to requests. Here is a quick overview:

```python
import hrequests

# perform HTTP client requests
resp = hrequests.get('https://httpbin.org/html')
print(resp.status_code)
# 200

# use headless browsers and sessions:
session = hrequests.Session('chrome', version=122, os="mac")

# supports asyncio and easy concurrency
# (named `reqs` to avoid confusion with the `requests` library)
reqs = [
    hrequests.async_get('https://www.google.com/', browser='firefox'),
    hrequests.async_get('https://www.duckduckgo.com/'),
    hrequests.async_get('https://www.yahoo.com/'),
    hrequests.async_get('https://www.httpbin.org/'),
]
responses = hrequests.map(reqs, size=3)  # max 3 concurrent requests
```
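To make the idea of a concurrency cap such as `size=3` more concrete, here is a minimal standard-library sketch. It is not how hrequests implements `map`; `bounded_map`, `fake_fetch`, and `stats` are hypothetical names introduced only for illustration, and no network request is made.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def bounded_map(jobs, size):
    """Run zero-argument callables with at most `size` running at once.

    Results are returned in the same order as `jobs`.
    """
    with ThreadPoolExecutor(max_workers=size) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]


# Hypothetical stand-in for a network call: it only sleeps and records
# how many calls overlap, so the cap can be observed.
stats = {"active": 0, "peak": 0}
stats_lock = threading.Lock()


def fake_fetch(url):
    with stats_lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.05)  # simulated latency
    with stats_lock:
        stats["active"] -= 1
    return f"fetched {url}"


urls = [
    "https://www.google.com/",
    "https://www.duckduckgo.com/",
    "https://www.yahoo.com/",
    "https://www.httpbin.org/",
]
results = bounded_map([partial(fake_fetch, url) for url in urls], size=3)
print(results)
print("peak concurrency:", stats["peak"])
```

With four jobs and a cap of three, the fourth job only starts once one of the first three has finished, so the recorded peak never exceeds the cap.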
