puppeteer-stealth vs selenium-driverless
Puppeteer Stealth is a Puppeteer plugin that hardens headless browsers for web scraping. It makes Puppeteer scrapers harder to detect, which allows scraping targets that use headless-browser detection techniques.
Puppeteer-stealth does this by applying various JavaScript patches that cover up traces of a headless browser in the scraping browser's environment.
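To illustrate what a "JavaScript patch" means here, the following is a minimal sketch and not the plugin's actual code. It assumes a plain object, `fakeNavigator`, standing in for the browser's `navigator`, a hypothetical detector `looksHeadless`, and a hypothetical `applyPatches` helper. In a real Puppeteer session, scripts of this kind would typically be injected before the page's own scripts run, for example with `page.evaluateOnNewDocument`.

```javascript
// Illustrative stand-in for the browser's `navigator` object (assumption:
// in a real page these values are supplied by the browser itself).
const fakeNavigator = {
  webdriver: true,
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64) HeadlessChrome/120.0.0.0 Safari/537.36',
};

// Hypothetical detector: the kind of check an anti-bot script might run.
function looksHeadless(nav) {
  return nav.webdriver === true || /HeadlessChrome/.test(nav.userAgent);
}

// Hypothetical patches, similar in spirit to stealth evasions:
// redefine the telltale properties so they no longer expose headless values.
function applyPatches(nav) {
  const cleanedUserAgent = nav.userAgent.replace('HeadlessChrome', 'Chrome');
  Object.defineProperty(nav, 'webdriver', {
    get: () => undefined,
    configurable: true,
  });
  Object.defineProperty(nav, 'userAgent', {
    get: () => cleanedUserAgent,
    configurable: true,
  });
  return nav;
}

const before = looksHeadless(fakeNavigator);
const after = looksHeadless(applyPatches(fakeNavigator));
console.log({ before, after });
```

In this sketch the detector flags the unpatched object and no longer flags it once the patches are applied. Real detection scripts check many more signals, which is why the plugin bundles a whole set of evasions rather than a single patch.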
Selenium Driverless is a Selenium-inspired browser automation library focused on bypassing web scraping detection. It shares most of Selenium's API and UX but adds several extensions that make the scraper harder to detect, plus extra usability features such as:
- Bypass Cloudflare
- Multiple tab scraping
- Multiple context support
- Proxy auth
- Network interception
Example Use
const puppeteer = require('puppeteer-extra')
// add stealth plugin and use defaults (all evasion techniques)
const StealthPlugin = require('puppeteer-extra-plugin-stealth')
puppeteer.use(StealthPlugin())
// puppeteer usage as normal
puppeteer.launch({ headless: true }).then(async browser => {
  console.log('Running tests..')
  const page = await browser.newPage()
  await page.goto('https://bot.sannysoft.com')
  // wait 5 seconds for the detection tests to finish
  // (page.waitForTimeout was removed in recent Puppeteer releases)
  await new Promise(resolve => setTimeout(resolve, 5000))
  await page.screenshot({ path: 'result.png', fullPage: true })
  await browser.close()
  console.log('success - check the result.png screenshot')
})
# It works the same as Selenium, just with a different import (here, undetected_chromedriver).
import undetected_chromedriver as uc
driver = uc.Chrome(headless=True, use_subprocess=False)
driver.get('https://nowsecure.nl')
driver.save_screenshot('screenshot.png')
# quit() ends the whole browser session; close() only closes the current window
driver.quit()