Puppeteer Stealth: How to Avoid Bot Detection (Without Getting Your IP Burned)

If you’re searching for puppeteer stealth, you’ve probably experienced one of these:

  • your script works locally, but the server blocks you in production
  • you get CAPTCHAs after a few pages
  • you see “Access denied”, “unusual traffic”, 403, or a blank HTML shell
  • you’re rotating user agents, but your IP still gets burned

This guide is the practical truth:

  • stealth plugins help — but they’re not a silver bullet
  • fingerprint tricks can backfire
  • most “stealth success” is actually crawl design: pacing, session reuse, and network strategy

We’ll cover:

  1. what bot defenses look at (in 2026)
  2. what puppeteer-extra-plugin-stealth actually changes
  3. how to detect blocks programmatically
  4. when rotating IPs beats fingerprint hacks
  5. patterns that keep your IPs alive longer

When stealth isn’t enough, make the network layer stable

Stealth tweaks fingerprints. But many blocks are IP/rate-limit driven. ProxiesAPI gives you a proxy-backed fetch URL (and optional rendering) so you can design crawls that burn fewer IPs and complete more runs.


1) How modern bot detection works (high level)

Most defenses score you across multiple signals:

Network signals

  • IP reputation (datacenter vs residential vs “dirty” IP)
  • request rate / burstiness
  • ASN concentration (too many requests from one provider)
  • geo mismatch (login region vs IP)

Browser / fingerprint signals

  • headless indicators
  • inconsistent properties (e.g., WebGL vendor doesn’t match platform)
  • missing APIs / permissions weirdness
  • automation artifacts (webdriver, unusual timing patterns)

Behavior signals

  • no scrolling / no mouse
  • clicks too fast
  • always the same path
  • never loading images/fonts (resource patterns)
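The cheapest behavioral fix is pacing with jitter instead of fixed sleeps, so your actions don't fire at machine-regular intervals. A minimal helper (the bounds are illustrative, not a recommendation for any specific target):

```javascript
// Return a randomized delay in milliseconds within [min, max],
// so repeated actions don't land on a perfectly regular cadence.
function humanDelayMs(min = 800, max = 2500) {
  return min + Math.floor(Math.random() * (max - min + 1));
}
```

You'd use it between navigations or clicks, e.g. `await sleep(humanDelayMs())` with the `sleep` helper from the baseline setup below.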

Content / target signals

  • scraping “hot” endpoints that are heavily protected
  • hitting the same page repeatedly

Stealth tools only address part of the picture.


2) What puppeteer-extra-plugin-stealth changes

puppeteer-extra-plugin-stealth is a bundle of evasions. It typically:

  • patches navigator.webdriver
  • adjusts plugins, languages, permissions
  • changes some Chrome/Headless quirks
  • can tweak WebGL / hairline / iframe checks depending on version

What it doesn’t do:

  • magically give you a clean IP
  • fix rate limiting
  • fix behavioral anomalies
  • guarantee your fingerprint is “real enough” for advanced checks
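One way to keep yourself honest is to pull a small fingerprint report from the page (via `page.evaluate` in Puppeteer) and scan it for obvious automation tells. A sketch, where the report shape and the specific checks are assumptions for illustration, not an exhaustive detector:

```javascript
// Scan a fingerprint report (collected in the page context) for
// common automation tells. Returns a list of human-readable findings.
function fingerprintTells(report) {
  const tells = [];
  if (report.webdriver) tells.push("navigator.webdriver is true");
  if (!report.languages || report.languages.length === 0) {
    tells.push("navigator.languages is empty");
  }
  if (report.pluginCount === 0) tells.push("navigator.plugins is empty");
  return tells;
}
```

An empty result doesn't mean you're invisible; it only means you pass the checks you thought to write down.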

That’s why people get burned: they focus on stealth and ignore the crawl.


3) A baseline Puppeteer setup (headful, slow, sane)

Start with a stable baseline.

npm i puppeteer puppeteer-extra puppeteer-extra-plugin-stealth

import puppeteer from "puppeteer-extra";
import StealthPlugin from "puppeteer-extra-plugin-stealth";

puppeteer.use(StealthPlugin());

const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

export async function run(url) {
  const browser = await puppeteer.launch({
    headless: false, // start headful while iterating
    args: [
      "--no-sandbox",
      "--disable-setuid-sandbox",
      "--lang=en-US,en",
    ],
  });

  const page = await browser.newPage();

  // Reasonable defaults
  await page.setViewport({ width: 1280, height: 800 });
  await page.setUserAgent(
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"
  );

  await page.goto(url, { waitUntil: "networkidle2", timeout: 60000 });
  await sleep(1200);

  const title = await page.title();
  console.log("title:", title);

  await browser.close();
}

run("https://example.com");

If this fails immediately with a block page, stealth isn’t your first problem.


4) Detect blocks programmatically (don’t guess)

You need hard signals in code:

  • response status (403, 429, 503)
  • specific block keywords in HTML
  • unexpected page titles
  • CAPTCHA markers

Capture main document response

function looksBlocked(html) {
  const t = (html || "").toLowerCase();
  return (
    t.includes("access denied") ||
    t.includes("unusual traffic") ||
    t.includes("verify you are human") ||
    t.includes("captcha")
  );
}

export async function fetchHtmlWithSignals(page, url) {
  const resp = await page.goto(url, { waitUntil: "domcontentloaded", timeout: 60000 });
  const status = resp?.status();

  const html = await page.content();
  const title = await page.title();

  return {
    url,
    status,
    title,
    blocked: (status && status >= 400) || looksBlocked(html) || /captcha/i.test(title),
    html,
  };
}

When you run crawls at scale, this is what powers:

  • retry policies
  • backoff
  • “switch IP/session” logic
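The three ideas above can be wired into a single decision function that consumes the result of `fetchHtmlWithSignals`. A sketch, where the attempt cutoffs and delay bounds are illustrative assumptions:

```javascript
// Map a fetch result to a crawl action: proceed, retry, back off,
// or rotate the IP/session. Thresholds here are illustrative.
function nextAction(result, attempt) {
  if (!result.blocked) return { action: "ok" };
  if (result.status === 429 || result.status === 503) {
    // exponential backoff with a little jitter, capped at 60s
    const base = Math.min(60000, 1000 * 2 ** attempt);
    return { action: "backoff", delayMs: base + Math.floor(Math.random() * 250) };
  }
  if (attempt >= 2) return { action: "rotate" }; // switch IP/session
  return { action: "retry" };
}
```

The point is that "blocked" stops being a vibe and becomes a branch your crawler can act on.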

5) The mistake: rotating fingerprints while keeping the same IP

A very common failure mode:

  • you randomize user agents
  • you randomize viewport
  • you randomize timezones

…but all requests still come from one IP or one small IP pool.

For many targets, IP reputation + request rate dominate.

Practical rule

  • If you’re blocked quickly across different fingerprints, you likely need better IP strategy and pacing.
  • If you’re blocked only on certain flows/pages, you likely need better behavior/session handling.

6) How to avoid getting your IP burned (crawl design)

Here are patterns that consistently reduce burn:

A) Slow down like a human (but consistently)

  • don’t burst 200 page loads in a minute
  • implement token-bucket rate limiting per domain
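A token bucket per domain is small enough to sketch inline. This is a minimal, in-memory version (the rate and burst values are illustrative and should be tuned per target):

```javascript
// Minimal per-domain token bucket. Each domain gets `burst` tokens
// that refill at `ratePerSec`; a request may proceed only if a
// token is available.
class DomainLimiter {
  constructor(ratePerSec = 1, burst = 3) {
    this.ratePerSec = ratePerSec;
    this.burst = burst;
    this.buckets = new Map(); // domain -> { tokens, last }
  }
  tryTake(domain, now = Date.now()) {
    const b = this.buckets.get(domain) ?? { tokens: this.burst, last: now };
    // refill based on elapsed time, capped at the burst size
    b.tokens = Math.min(this.burst, b.tokens + ((now - b.last) / 1000) * this.ratePerSec);
    b.last = now;
    const allowed = b.tokens >= 1;
    if (allowed) b.tokens -= 1;
    this.buckets.set(domain, b);
    return allowed;
  }
}
```

When `tryTake` returns false, sleep and retry rather than dropping the URL — the limiter shapes traffic, it doesn't discard work.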

B) Reuse sessions (don’t look like 10,000 new users)

  • keep cookies for a while
  • keep a browser context per “identity”
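For session reuse to work, the mapping from domain to identity has to be stable across runs — the same site should always see the same cookie jar / browser context. One simple sketch (the hash is a toy; any stable hash works):

```javascript
// Pin a domain to one of `poolSize` identities (cookie jars,
// browser contexts, proxy sessions) deterministically.
function identityFor(domain, poolSize) {
  let h = 0;
  for (const ch of domain) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % poolSize;
}
```

With Puppeteer you'd keep one `browser.createBrowserContext()` (and its cookies) per identity index, instead of a fresh context per page.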

C) Reduce page loads

  • don’t open PDPs you don’t need
  • scrape listing pages first, sample PDPs
  • cache responses

D) Avoid “hot” endpoints

  • some endpoints are aggressively protected
  • find alternate sources: JSON data, sitemaps, RSS, etc.

E) Use a stable proxy-backed fetch when you can

If you don’t need full JS interaction, a proxy-backed HTTP fetch is often:

  • cheaper
  • faster
  • less fingerprint-sensitive

That’s where ProxiesAPI can fit: same URL list, fewer headless runs.


7) When to use Puppeteer stealth vs other approaches

Here’s the practical decision table.

Problem | Best approach | Why
--- | --- | ---
Server-rendered HTML, mild throttling | HTTP + retries + proxies | Simple + fast
JS-heavy pages, content requires rendering | Playwright/Puppeteer | Need real browser
Aggressive bot defense on navigation | Hybrid + careful pacing | Full headless alone gets burned
You only need a dataset, not a UI journey | Find JSON endpoints / structured feeds | Most stable

Stealth is one tool in the toolbox.


8) A safer “hybrid crawler” pattern

Use headless only where necessary.

  1. Fetch PLPs with HTTP (proxy-backed)
  2. Extract PDP URLs
  3. Fetch a small sample of PDPs with headless for validation
  4. If needed, only then expand headless coverage

This reduces your exposure dramatically.
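Step 3's "small sample" is worth making deterministic, so repeated runs validate the same PDPs and differences between runs mean something. A sketch:

```javascript
// Pick an evenly spaced, deterministic sample of n URLs for
// headless validation. Sorting first makes the sample stable
// across runs regardless of input order.
function samplePdps(urls, n) {
  const sorted = [...urls].sort();
  if (sorted.length <= n) return sorted;
  const step = sorted.length / n;
  return Array.from({ length: n }, (_, i) => sorted[Math.floor(i * step)]);
}
```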


Where ProxiesAPI fits (honestly)

Stealth plugins change browser signals.

But a lot of “bot detection pain” is network-layer:

  • too many requests from one IP
  • rate limiting
  • inconsistent routing

ProxiesAPI gives you a proxy-backed fetch URL and (depending on plan) optional rendering. Used well, it supports crawl patterns that complete more runs and burn fewer IPs.


Checklist: before you blame stealth

  • Are you rate limiting per domain?
  • Do you retry with backoff on 429/503?
  • Are you caching and deduping URLs?
  • Are you reusing sessions/cookies appropriately?
  • Are you detecting blocks and switching strategy?
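Several of these checks reduce to simple pre-crawl hygiene. Deduping, for example, is often just normalizing URLs before they enter the queue — a sketch using Node's built-in WHATWG `URL` (here only the fragment is stripped; real normalization rules depend on the target):

```javascript
// Dedupe a URL list by normalized form (fragment removed).
// Returns the first occurrence of each distinct URL.
function dedupeUrls(urls) {
  const seen = new Set();
  const out = [];
  for (const raw of urls) {
    const u = new URL(raw);
    u.hash = ""; // fragments never reach the server
    const key = u.toString();
    if (!seen.has(key)) {
      seen.add(key);
      out.push(raw);
    }
  }
  return out;
}
```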

Fix these first — then tune stealth.


Related guides

Rotating Proxies Explained: How They Work + When You Need Them for Web Scraping
A practical guide to rotating proxies: what rotation means, common rotation patterns, sticky vs per-request IPs, and how to decide if rotating proxies are worth it for your scraper.
Data Scraping for E-Commerce: Price Monitoring + Competitive Intel (2026 Playbook)
A practical, end-to-end workflow for data scraping for e-commerce: discover URLs, capture prices/availability, normalize catalogs, detect changes, and turn it into daily competitive intel.
Google Trends Scraping: API Options and DIY Methods (2026)
Compare official and unofficial ways to fetch Google Trends data, plus a DIY approach with throttling, retries, and proxy rotation for stability.
Web Scraping with Rust: reqwest + scraper Crate Tutorial (2026)
A practical Rust scraping guide: fetch pages with reqwest, rotate proxies, parse HTML with the scraper crate, handle retries/timeouts, and export structured data.