Why Headless Browsers Get Detected: A Technical Breakdown

March 12, 2026 · by imkyssa · 5 min read


Puppeteer rats itself out in at least 11 different ways the moment it starts up — and that's before it's even loaded a single page. Scraping tutorials almost never bring this up, then act shocked when the same script that ran perfectly on localhost gets hammered in production.

Here's what people get wrong: bot detection isn't a single if-statement checking one flag. It's a scoring system. Each signal you leak adds weight to a total, and once that total crosses a threshold, you're done — blocked, CAPTCHAed, or worst of all, quietly served garbage data so you don't even know it happened.
| Layer | What It Checks | When |
|---|---|---|
| TLS fingerprint (JA3) | Cipher suite order, extensions | TCP handshake — before HTTP |
| HTTP/2 fingerprint | Frame settings, header order | First request |
| navigator properties | webdriver, plugins, languages | JS runtime |
| Canvas / WebGL | Rendering entropy, GPU string | JS runtime |
| Mouse & keyboard | Movement patterns, timing | Behavioral |
| IP reputation | ASN, datacenter range | DNS / IP layer |
Most developers fixate on the navigator layer. They patch webdriver, maybe fake the user agent, and call it a day. They have no idea TLS fingerprinting has already clocked them before a single line of JavaScript ran.
Detection is cumulative and concurrent. Failing one check won't get you blocked. Getting blocked happens because a handful of small failures push the score over the threshold together. You can dodge navigator.webdriver perfectly and still get caught — because your JA3, canvas fingerprint, and plugin list aren't telling the same story.

Signal #1 — navigator.webdriver

```js
console.log(navigator.webdriver);
// Headless → true (instant detection)
// Real browser → undefined
```
The value itself isn't even the whole story. Detectors also inspect the property descriptor's configurability — that's a fingerprint of its own.
```js
// Looks like a fix, still detectable
Object.defineProperty(navigator, 'webdriver', { get: () => false });

// What a detector actually sees
Object.getOwnPropertyDescriptor(navigator, 'webdriver');
// → { get: ƒ, set: undefined, enumerable: false, configurable: false }
// In real Chrome there is no own 'webdriver' property on navigator at all:
// it lives on Navigator.prototype, so this call returns undefined
```
The classic mistake is patching properties after the page loads instead of injecting before. The clock doesn't wait.
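One way detectors exploit this, sketched here against plain mock objects rather than a live browser (the `looksPatched` helper is illustrative, not any real library's API):

```js
// Detector-side check: a real Chrome navigator has no OWN 'webdriver'
// property (it is inherited from Navigator.prototype), so the mere
// presence of an own-property descriptor is itself a tell.
function looksPatched(nav) {
  const desc = Object.getOwnPropertyDescriptor(nav, 'webdriver');
  return desc !== undefined; // naive patches create an own property
}

// Mock of an unpatched browser: 'webdriver' inherited from a prototype
const proto = {};
Object.defineProperty(proto, 'webdriver', { get: () => false });
const realish = Object.create(proto);

// Mock of a naive patch: own property defined directly on the object
const patched = {};
Object.defineProperty(patched, 'webdriver', { get: () => false });

console.log(looksPatched(realish)); // false (inherited, looks clean)
console.log(looksPatched(patched)); // true (own property leaks)
```

This is why stealth patches target `Navigator.prototype` rather than the `navigator` instance itself.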

Signal #2 — The plugins Array

```js
navigator.plugins.length;
// Real Chrome → 3–7
// Headless → 0  ← one-line detection

navigator.mimeTypes.length;
// Real Chrome → 2+
// Headless → 0
```
Any halfway-decent detection script checks navigator.plugins.length === 0 and stops right there. But stuffing in fake plugins isn't a real fix either. The names, descriptions, and mime types inside each plugin object all have to be internally consistent — and they have to match the user-agent you claimed. If your UA says Chrome 120 on macOS but your plugin list looks like Chrome on Windows, that mismatch is itself a signal.
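A sketch of what that consistency check might look like on the detector side; the `pluginsConsistent` helper and its expected-plugin list are assumptions for illustration, not a documented rule set:

```js
// Sketch of a consistency check between a claimed user-agent and the
// plugin list it ships with (names and rules here are illustrative).
function pluginsConsistent(userAgent, pluginNames) {
  if (pluginNames.length === 0) return false; // headless giveaway
  // Modern Chrome ships the same built-in PDF plugins on every OS
  const expected = ['PDF Viewer', 'Chrome PDF Viewer'];
  if (/Chrome\//.test(userAgent)) {
    return expected.every((name) => pluginNames.includes(name));
  }
  return true; // other browsers: no opinion in this sketch
}

console.log(pluginsConsistent('Mozilla/5.0 ... Chrome/120.0.0.0 ...', []));
// → false: empty plugin list on a Chrome UA

console.log(pluginsConsistent(
  'Mozilla/5.0 ... Chrome/120.0.0.0 ...',
  ['PDF Viewer', 'Chrome PDF Viewer', 'Chromium PDF Viewer']
));
// → true
```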

Signal #3 — Canvas Fingerprinting

```js
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
ctx.font = '11pt "Times New Roman"';
ctx.fillText('Cwm fjordbank', 2, 15);
const fingerprint = canvas.toDataURL();
// Real machine → unique hash, varies by hardware
// Headless Chrome → identical hash, every single time
```
Headless Chrome produces pixel-perfect identical output for identical code, no matter what machine it's running on. There's no GPU variance. Detection systems maintain databases of known headless canvas hashes. Yours is already in there.

Signal #4 — WebGL Renderer String

```js
const gl = document.createElement('canvas').getContext('webgl');
const info = gl.getExtension('WEBGL_debug_renderer_info');
gl.getParameter(info.UNMASKED_VENDOR_WEBGL);
// Real machine → "Intel Inc." / "NVIDIA Corporation"
gl.getParameter(info.UNMASKED_RENDERER_WEBGL);
// Headless → "Google SwiftShader" ← banned everywhere
```
SwiftShader is Google's software renderer, built for display-less environments. That string has been identified and blacklisted across detection systems everywhere. If SwiftShader shows up, you're flagged — doesn't matter what else you've cleaned up.
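The corresponding check is a plain substring match. A sketch (the token list is illustrative; detection vendors maintain much larger ones):

```js
// Sketch of the renderer-string check: any known software-renderer
// token in the WebGL renderer string is treated as an immediate flag.
const SOFTWARE_RENDERERS = ['SwiftShader', 'llvmpipe', 'Mesa OffScreen'];

function isSoftwareRenderer(rendererString) {
  return SOFTWARE_RENDERERS.some((token) => rendererString.includes(token));
}

console.log(isSoftwareRenderer('Google SwiftShader'));           // true
console.log(isSoftwareRenderer('ANGLE (Apple, Apple M1, ...)')); // false
```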

Signal #5 — TLS / JA3 Fingerprint

The first move in a TLS handshake is the client sending a ClientHello. Inside it: the list of cipher suites, extensions, and elliptic curves the client supports. The ordering of those items is dictated by the underlying TLS library — not the user-agent string you set.
JA3 = md5(SSLVersion, Ciphers, Extensions, EllipticCurves, ECPointFormats)
| Client | TLS Library | JA3 Hash |
|---|---|---|
| Chrome 120 / macOS | BoringSSL | cd08e31494f9531f560d64c695473da9 |
| Node.js 20 (axios/got) | OpenSSL | b32309a26951912be7dba376398abc3b |
| Python requests | Python ssl | 3b5074b1b5d032e5620f69f9f700ff0e |
You can set User-Agent: Chrome/120 all you want. The TLS handshake already announced Node.js before any JavaScript touched the page. There is no JS-layer fix for this one.

Signal #6 — HTTP/2 Fingerprint

Real Chrome and Node's http2 module send different SETTINGS frames:
```
Chrome 120:  HEADER_TABLE_SIZE=65536, ENABLE_PUSH=0, INITIAL_WINDOW_SIZE=6291456
Node.js:     HEADER_TABLE_SIZE=4096,  ENABLE_PUSH=1, INITIAL_WINDOW_SIZE=65535
```
This gets extracted at the load balancer level, well before any application logic sees the request.
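A sketch of how such a fingerprint can be serialized and compared, using the SETTINGS values quoted above (the `settingsFingerprint` format is illustrative, not any vendor's actual encoding):

```js
// Sketch: serialize the SETTINGS frame a client sent into a
// comparable fingerprint string.
function settingsFingerprint(settings) {
  return Object.entries(settings)
    .map(([name, value]) => `${name}:${value}`)
    .join(';');
}

const chrome120 = settingsFingerprint({
  HEADER_TABLE_SIZE: 65536,
  ENABLE_PUSH: 0,
  INITIAL_WINDOW_SIZE: 6291456,
});

const nodeDefault = settingsFingerprint({
  HEADER_TABLE_SIZE: 4096,
  ENABLE_PUSH: 1,
  INITIAL_WINDOW_SIZE: 65535,
});

console.log(chrome120 === nodeDefault); // false: different clients, different frames
```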

Signal #7 — Behavioral Entropy

```js
// Bot movement
mousemove: (100,200) (400,200) (400,500)    // perfect L-shapes, instant

// Human movement
mousemove: (100,200) (138,213) (201,228)... // curved, variable speed
```
Mouse movement is just one piece. Detection systems also profile keystroke timing (real humans: 50–200ms between keystrokes), scroll behavior, and how long someone spends on a page before doing anything. A bot that clicks 80 milliseconds after page load is a bot.
Math.random() delays don't fix this. A straight-line mouse path with randomized timing is still a straight-line mouse path.
Entropy scores accumulate across full sessions, not just individual events. That's why some bots clear the first checkpoint and get flagged 30 seconds later — the score built up over time, not all at once.
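The straight-line case in particular is cheap to detect with a collinearity test. A minimal sketch, with an arbitrary tolerance:

```js
// Sketch of a linearity check: if every intermediate point of a mouse
// path sits (almost) exactly on the line through its neighbours, the
// path is suspiciously straight. The tolerance here is arbitrary.
function pathIsLinear(points, tolerance = 1.0) {
  for (let i = 1; i < points.length - 1; i++) {
    const [x0, y0] = points[i - 1];
    const [x1, y1] = points[i];
    const [x2, y2] = points[i + 1];
    // Cross product of the two segment vectors ≈ 0 means collinear
    const cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1);
    if (Math.abs(cross) > tolerance) return false;
  }
  return true;
}

const bot = [[100, 200], [250, 200], [400, 200]];   // dead straight
const human = [[100, 200], [138, 213], [201, 228]]; // slight curve

console.log(pathIsLinear(bot));   // true
console.log(pathIsLinear(human)); // false
```

Real systems score many such features over the whole session, which is why randomizing only the timing leaves this check untouched.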

The Mistake Matrix

| Mistake | Why It Fails |
|---|---|
| Only patching navigator.webdriver | 10+ signals still leak |
| Using got/axios with spoofed headers | JA3 still says Node.js |
| No --disable-blink-features=AutomationControlled | window.chrome exposes automation flag |
| Datacenter proxies (AWS/GCP/Azure) | ASN is blacklisted before fingerprint checks even run |
| User-Agent without matching sec-ch-ua | Header contradiction — caught immediately |
| Math.random() delays only | Timing variance isn't behavioral entropy |

How the Score Actually Adds Up

All checks run simultaneously, scores stack:
```
IP reputation:        +0.1  (clean, residential)
JA3 mismatch:         +0.6  (Node.js TLS on Chrome UA)
navigator.webdriver:  +0.0  (patched correctly)
Canvas hash:          +0.4  (known headless hash)
Plugin count:         +0.3  (empty plugins)
Mouse entropy:        +0.5  (straight-line movement)
─────────────────────────────────────────────
Total: 1.9  →  Block threshold: 1.5
```
You nailed navigator. Doesn't matter — TLS and canvas alone already pushed it over.
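That arithmetic can be sketched as a plain weighted sum against a threshold; the weights mirror the example above and are illustrative, not any vendor's real values:

```js
// Sketch of the cumulative model: independent signal weights summed
// against a block threshold. Weights and threshold are illustrative.
function riskScore(signals, threshold = 1.5) {
  const total = Object.values(signals).reduce((a, b) => a + b, 0);
  return { total, blocked: total >= threshold };
}

const session = riskScore({
  ipReputation: 0.1, // clean residential IP
  ja3Mismatch: 0.6,  // Node.js TLS on a Chrome UA
  webdriver: 0.0,    // patched correctly
  canvasHash: 0.4,   // known headless hash
  pluginCount: 0.3,  // empty plugins
  mouseEntropy: 0.5, // straight-line movement
});

console.log(session.total.toFixed(1)); // "1.9"
console.log(session.blocked);          // true
```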

The sec-ch-ua Problem

Chrome 90+ attaches client hints to every request. A real Chrome session looks like:
```http
sec-ch-ua: "Chromium";v="120", "Google Chrome";v="120", "Not-A.Brand";v="99"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "macOS"
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)...
```
A Puppeteer session that sets a modern user-agent but sends no client hints looks like:
```http
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)...
# sec-ch-ua: missing entirely
```
A Chrome 90+ user-agent without sec-ch-ua is a physical impossibility. Flagged on arrival. And just being present isn't enough — the brand token order and version numbers have to be consistent with the full UA string.
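A sketch of that cross-check, matching the Chrome major version claimed in the UA string against the brand list (the `clientHintsConsistent` helper and its rules are illustrative, not a real detector's logic):

```js
// Sketch: cross-check the Chrome major version in the UA string
// against the sec-ch-ua brand list.
function clientHintsConsistent(userAgent, secChUa) {
  const uaMatch = userAgent.match(/Chrome\/(\d+)/);
  if (!uaMatch) return true;  // not claiming Chrome: no opinion here
  if (!secChUa) return false; // Chrome 90+ always sends sec-ch-ua
  const hintMatch = secChUa.match(/"Google Chrome";v="(\d+)"/);
  return hintMatch !== null && hintMatch[1] === uaMatch[1];
}

const ua = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ... Chrome/120.0.0.0 ...';
const hints = '"Chromium";v="120", "Google Chrome";v="120", "Not-A.Brand";v="99"';

console.log(clientHintsConsistent(ua, hints)); // true (versions agree)
console.log(clientHintsConsistent(ua, null));  // false (header missing entirely)
```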

Tools vs. What They Actually Cover

| Tool | Fixes | Doesn't Fix |
|---|---|---|
| puppeteer-extra-plugin-stealth | JS-layer signals | TLS/HTTP2 fingerprint |
| rebrowser-puppeteer | CDP leaks, runtime injection | TLS, behavioral |
| Go + CycleTLS | JA3 fingerprint | Behavioral, canvas |
| Real Chrome via CDP | TLS, canvas, GPU | Proxy/IP reputation |
The only client that passes every layer by default is a real Chrome browser, running on consumer hardware, behind a residential IP.
Most guides don't even touch this. They treat detection as a checklist — knock off each item, done. But detection systems are probabilistic. They don't need certainty, just confidence. Fix 9 out of 10 signals and you can still get blocked if that last signal carries enough weight. JA3 mismatches typically score 0.5–0.7. One leak can be all it takes.
