Anti-Bot Protections

Modern websites use anti-bot protection systems to detect and block automated traffic, including web scrapers. Understanding these systems is essential for building reliable scrapers.

Need to bypass anti-bot protections?

Scrapfly handles anti-bot bypass automatically for all major protection systems with a single API parameter. See the bypass hub for details, or check Scrapeway benchmarks for independent success rate data.

How Anti-Bot Systems Work

Most anti-bot systems use a combination of these detection methods:

Detection Method	What It Does	Evasion Difficulty
TLS Fingerprinting	Analyzes the TLS handshake (cipher suites, extensions, curves) to identify the client. Each browser family and version produces a characteristic TLS signature, summarized as a JA3 or JA4 hash, and standard HTTP libraries produce signatures that differ from those of real browsers.	Hard
HTTP/2 Fingerprinting	Examines HTTP/2 frame settings, header ordering, and priority schemes that differ between browsers and HTTP libraries.	Hard
JavaScript Challenges	Injects obfuscated JavaScript that must be executed correctly. Verifies the browser environment is genuine.	Moderate
Browser Fingerprinting	Collects Canvas, WebGL, Audio context, fonts, screen resolution, and other browser properties to build a device fingerprint.	Hard
Behavioral Analysis	Monitors mouse movements, scroll patterns, click timing, and navigation flow to distinguish humans from bots.	Very Hard
CAPTCHA Challenges	Presents visual or interactive challenges (Turnstile, reCAPTCHA, hCaptcha, FunCaptcha) that require human solving or CAPTCHA API services.	Hard
IP Reputation	Checks the request IP against databases of known data centers, VPNs, and proxy services. Residential IPs are trusted more.	Moderate
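
To make the TLS fingerprinting row concrete, here is a minimal sketch of how a JA3 hash is derived: five ClientHello fields are joined into a string and hashed with MD5. The numeric values below are hypothetical and chosen only for illustration, and the sketch does not cover JA4, which uses a different format.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input string: fields joined by ',', values by '-'."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return ",".join(fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Return the MD5 hex digest of the JA3 string."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Hypothetical ClientHello values: same cipher suites, but offered in a
# different order and with a different extension list.
browser_like = ja3_hash(771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
library_like = ja3_hash(771, [4866, 4867, 4865], [0, 11, 10, 23], [29, 23, 24], [0])

print(browser_like)
print(library_like)
print("fingerprints match:", browser_like == library_like)
```

Because ordering is part of the hashed string, a client that offers the same ciphers in a different order still gets a different fingerprint, which is why changing only the User-Agent header does not help.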

Major Anti-Bot Systems

Cloudflare

The most widely deployed anti-bot system, protecting millions of websites. Cloudflare uses a layered approach combining TLS fingerprinting, JavaScript challenges, the Turnstile CAPTCHA, and behavioral analysis.


Detection	TLS fingerprinting (JA3/JA4), Turnstile CAPTCHA, JS challenges, HTTP header validation
Difficulty	Moderate to Hard
Used By	Indeed, and hundreds of thousands of other websites
Bypass	Bypass Cloudflare with Scrapfly (98% success rate)

Cloudflare is the most common anti-bot system you will encounter. Standard HTTP libraries usually fail against Cloudflare-protected sites because their TLS fingerprints do not match a real browser. You need either a TLS fingerprint library like curl-cffi or primp, an anti-detect browser, or a web scraping API.
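
As a rough sketch of the TLS fingerprint library option, the function below wraps curl-cffi's requests-style API with its impersonate parameter. The function name fetch_with_browser_tls and the default values are assumptions made for this example, and a matching TLS fingerprint alone may not be enough if the site also serves JavaScript or CAPTCHA challenges.

```python
def fetch_with_browser_tls(url: str, browser: str = "chrome", timeout: float = 30.0):
    """Fetch a URL while presenting a browser-like TLS/HTTP2 fingerprint.

    Requires the third-party package: pip install curl_cffi
    The "chrome" target is an alias in recent curl_cffi releases; older
    releases may need a pinned target such as "chrome110".
    """
    # Imported lazily so this module still loads without curl_cffi installed.
    from curl_cffi import requests

    return requests.get(url, impersonate=browser, timeout=timeout)


# Example call (performs a real network request, so it is left commented out):
# response = fetch_with_browser_tls("https://example.com")
# print(response.status_code)
```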

DataDome

One of the most sophisticated anti-bot systems, using per-customer ML models that learn from each website's unique traffic patterns. DataDome is very difficult to bypass at scale because its behavioral analysis continuously adapts.


Detection	Real-time ML models, device fingerprinting, behavioral analysis (mouse, scroll, keyboard), slider CAPTCHA
Difficulty	Very Hard
Used By	Etsy, TripAdvisor, Foot Locker, SoundCloud
Bypass	Bypass DataDome with Scrapfly (96% success rate)

DataDome's per-customer ML models mean each protected website behaves differently, so bypass techniques that work on one site may not work on another. For reliable scraping, a web scraping API is usually the best approach.

Akamai Bot Manager

Akamai's anti-bot solution is deployed at the CDN edge, making it fast and hard to circumvent. TLS fingerprinting is its primary detection vector, combined with sensor data validation and device fingerprinting.


Detection	TLS fingerprinting, sensor data (_abck cookies), device fingerprinting, behavioral analysis
Difficulty	Hard to Very Hard
Used By	Major enterprise websites across finance, retail, and media
Bypass	Bypass Akamai with Scrapfly (97% success rate)

Akamai's TLS fingerprinting is particularly effective because it blocks at the edge before requests even reach the origin server. Libraries like curl-cffi that impersonate browser TLS fingerprints are essential for direct bypass attempts.

PerimeterX (HUMAN Security)

PerimeterX (now HUMAN Security) uses sophisticated behavioral biometrics to detect automation. It tracks mouse movements, click patterns, keystroke timing, and navigation sequences to build a behavioral profile.


Detection	Behavioral biometrics (_px cookies), Human Challenge, browser fingerprinting, IP reputation
Difficulty	Hard
Used By	Zillow, StockX, Wayfair, Booking.com, Craigslist
Bypass	Bypass PerimeterX with Scrapfly (95% success rate)

PerimeterX is commonly found on e-commerce and real estate websites. For scraping targets like Zillow or StockX, you will need to handle PerimeterX challenges.

Kasada

Kasada uses proof-of-work challenges that require computational resources to solve, combined with behavioral analysis and threat intelligence. This makes automated bypass more expensive.


Detection	Proof-of-work challenges, kas.js cookies, behavioral analysis, threat intelligence
Difficulty	Hard
Used By	Realtor.com and other high-value targets
Bypass	Bypass Kasada with Scrapfly (94% success rate)

Kasada's proof-of-work approach means each request costs computational time, making high-volume bypass expensive. For scraping Realtor.com and similar Kasada-protected sites, a web scraping API is the practical choice.
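
To illustrate why proof-of-work raises the cost per request, here is a generic hashcash-style sketch. It is not Kasada's actual scheme, which is proprietary; the challenge string, the function names, and the difficulty value are all hypothetical.

```python
import hashlib
import itertools
import time


def _digest_value(challenge: str, nonce: int) -> int:
    """SHA-256 of 'challenge:nonce', interpreted as a big-endian integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def verify_pow(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Cheap check: the hash must have `difficulty_bits` leading zero bits."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


def solve_pow(challenge: str, difficulty_bits: int) -> int:
    """Expensive search: try nonces 0, 1, 2, ... until one verifies."""
    for nonce in itertools.count():
        if verify_pow(challenge, nonce, difficulty_bits):
            return nonce


start = time.perf_counter()
nonce = solve_pow("hypothetical-challenge", 16)
elapsed = time.perf_counter() - start
print(f"nonce={nonce}, solved in {elapsed:.3f}s")
print("valid:", verify_pow("hypothetical-challenge", nonce, 16))
```

Verification takes a single hash while solving takes about 2^difficulty_bits hashes on average, so the server can raise the price of each request without doing much work itself.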

Imperva / Incapsula

One of the oldest WAF/anti-bot providers. Imperva collects 180+ encrypted values via client-side JavaScript to build a trust score for each visitor.


Detection	reese84 challenges, incap_ses cookies, JS fingerprinting (180+ signals), behavioral analysis
Difficulty	Moderate to Hard
Used By	Enterprise websites across healthcare, finance, and government
Bypass	Bypass Incapsula with Scrapfly (96% success rate)

Imperva typically returns 403 errors when it detects automated traffic. The reese84 challenge mechanism requires JavaScript execution, so basic HTTP clients will be blocked.

F5 Shape Security

F5's bot defense uses a randomized virtual machine with custom opcodes for client-side detection, making it one of the hardest anti-bot systems to reverse engineer.


Detection	VM-based obfuscation, TS cookies, BIG-IP ASM, TLS fingerprinting, client-side protection (f5_cspm)
Difficulty	Very Hard
Used By	Major airlines, banks, and enterprise sites
Bypass	Bypass F5 with Scrapfly (95% success rate)

F5's VM-based obfuscation is extremely difficult to reverse engineer, and standard headless browsers and basic HTTP clients are typically blocked. This is one of the few anti-bot systems where a web scraping API is almost always the right approach.

AWS WAF Bot Control

Amazon's cloud-native WAF with two detection levels: Common (self-identifying bots) and Targeted (ML-based detection of sophisticated bots).


Detection	ML analysis, aws-waf-token cookies, challenge.js scripts, Bot Control rules
Difficulty	Moderate
Used By	Amazon (CloudFront WAF), and websites hosted on AWS
Bypass	Bypass AWS WAF with Scrapfly (96% success rate)

AWS WAF is one of the easier anti-bot systems to handle compared to specialized providers. For scraping Amazon specifically, the main challenge is CloudFront WAF, which uses AWS WAF Bot Control under the hood.

Arkose Labs / FunCaptcha

Arkose Labs specializes in interactive CAPTCHA challenges (gamified puzzles, 3D object manipulation) combined with behavioral deep scanning.


Detection	Gamified CAPTCHA challenges, behavioral deep scan, risk profiling, device fingerprinting
Difficulty	Hard (requires CAPTCHA solving)
Used By	LinkedIn, Adobe, Roblox, Microsoft, OpenAI

Arkose Labs challenges require either manual solving, CAPTCHA solving services (2captcha, anti-captcha), or specialized automation. For scraping LinkedIn, Arkose Labs (FunCaptcha) is the primary challenge.

Difficulty Ranking

From easiest to hardest to bypass:

Rank	System	Difficulty	Primary Detection
1	AWS WAF	Moderate	ML + token cookies
2	Imperva	Moderate-Hard	JS fingerprinting + reese84
3	Cloudflare	Moderate-Hard	TLS + Turnstile
4	PerimeterX	Hard	Behavioral biometrics
5	Kasada	Hard	Proof-of-work
6	Arkose Labs	Hard	Interactive CAPTCHA
7	Akamai	Hard-Very Hard	TLS + sensor data
8	DataDome	Very Hard	Per-customer ML
9	F5 Shape	Very Hard	VM obfuscation

Choosing the Right Approach

Your Situation	Recommended Approach
No anti-bot detected	Standard HTTP client (httpx, requests)
TLS/HTTP fingerprinting only	curl-cffi, primp
JavaScript challenges	Anti-detect browser or headless browser
CAPTCHA challenges	CAPTCHA solving service + browser automation
Multiple protection layers	Web scraping API like Scrapfly
Production at scale	Web scraping API for reliability and lower maintenance
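
The table above can be read as a simple decision function. The sketch below encodes it directly; the function name and boolean flags are hypothetical, and real targets often need some trial and error beyond this mapping.

```python
def recommend_approach(
    tls_fingerprinting: bool = False,
    js_challenge: bool = False,
    captcha: bool = False,
    at_scale: bool = False,
) -> str:
    """Map observed protections to the approach suggested in the table."""
    layers = sum([tls_fingerprinting, js_challenge, captcha])

    # Several layers at once, or production volume: delegate to an API.
    if at_scale or layers > 1:
        return "web scraping API"
    if captcha:
        return "CAPTCHA solving service + browser automation"
    if js_challenge:
        return "anti-detect browser or headless browser"
    if tls_fingerprinting:
        return "curl-cffi or primp"
    return "standard HTTP client (httpx, requests)"


print(recommend_approach())
print(recommend_approach(tls_fingerprinting=True))
print(recommend_approach(tls_fingerprinting=True, js_challenge=True))
```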

For an independent comparison of how well different scraping APIs handle anti-bot protections, see Scrapeway's benchmarks.

Identify Anti-Bot Protection

Not sure which anti-bot system a website uses? Try Scrapfly's Anti-Bot Detector to identify the protection system before you start building your scraper.
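
For a quick first guess without any tooling, the cookie names listed in the detection rows above can serve as markers. The sketch below matches cookie names by prefix; the function name and marker table are illustrative assumptions, the list is not exhaustive, and a missing cookie does not prove a site is unprotected.

```python
# Cookie-name prefixes taken from the detection rows in this document.
COOKIE_MARKERS = [
    ("_abck", "Akamai Bot Manager"),
    ("_px", "PerimeterX (HUMAN Security)"),
    ("incap_ses", "Imperva / Incapsula"),
    ("reese84", "Imperva / Incapsula"),
    ("aws-waf-token", "AWS WAF Bot Control"),
]


def guess_antibot(cookie_names):
    """Return the systems whose marker prefixes appear in the cookie names."""
    found = []
    for name in cookie_names:
        lowered = name.lower()
        for prefix, system in COOKIE_MARKERS:
            if lowered.startswith(prefix) and system not in found:
                found.append(system)
    return found


# Cookie names as they might appear in a response's Set-Cookie headers
# (hypothetical values).
print(guess_antibot(["_abck", "session_id", "incap_ses_123_456"]))
```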

Browser Libraries - anti-detect browsers and TLS fingerprint tools
Browser Automation - Playwright, Puppeteer, Selenium
Web Scrapers - ready-to-use scrapers for popular protected websites
Frameworks - web scraping frameworks