HTTP 429 Too Many Requests While Scraping: Causes, Fixes, and Retry Patterns

If you scrape at scale, HTTP 429 Too Many Requests isn’t an “edge case”.

It’s the server telling you:

  • “you’re too fast”
  • “your traffic looks automated”
  • or “your identity is rate-limited right now”

This post is a practical playbook for stopping 429s without cargo-culting random delays.

Reduce 429s by stabilizing the network layer

If you’re doing everything right and still seeing bursty failures, ProxiesAPI can help keep the fetch layer consistent — but you still need real rate limits and backoff.


What 429 actually means

HTTP 429 is returned when the server (or an upstream WAF/CDN) decides you’ve exceeded a limit.

That limit could be enforced by:

  • the application (API gateway)
  • the CDN (Cloudflare / Akamai / Fastly)
  • a bot defense layer (fingerprinting + heuristics)

Sometimes you’ll also get:

  • a Retry-After header
  • a JSON error body
  • a generic HTML “slow down” page
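
When you are debugging, it helps to log which of these shapes a 429 actually took, since that hints at which layer is throttling you. Here is a minimal sketch; `classify_429` is a hypothetical helper (not part of any library) that takes plain header and body values so it works with any HTTP client.

```python
import json


def classify_429(headers: dict, body: str) -> dict:
    """Summarize what a 429 response carries (hypothetical helper)."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    info = {"retry_after": lowered.get("retry-after"), "kind": "unknown"}

    content_type = lowered.get("content-type", "").lower()
    if "json" in content_type:
        try:
            json.loads(body)
            info["kind"] = "json"
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            info["kind"] = "invalid-json"
    elif "html" in content_type:
        info["kind"] = "html"
    return info
```

A JSON body usually points at the application or API gateway, while a generic HTML page more often comes from a CDN or bot defense layer. Treat that as a heuristic, not a rule.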

Common causes (ranked by how often they bite scrapers)

Cause                 | What it looks like                                   | Fix
----------------------|------------------------------------------------------|------------------------------------
Too much concurrency  | 429 spikes right after you increase workers          | Cap concurrency per host
No backoff            | repeated 429s for the same URL cluster               | Exponential backoff + jitter
Identity-based limits | one IP/session gets blocked quickly                  | Rotate identity + pace requests
Request bursts        | 200s then sudden 429 wave                            | Token bucket / leaky bucket limiter
Hot endpoints         | listing/search endpoints 429 faster than detail pages | Separate budgets per endpoint type

Fix #1: Concurrency limits (the fastest win)

One of the most common reasons scrapers hit 429s is that they parallelize too aggressively.

Even a “small” scraper with 50 threads can hammer a single host.

Rule of thumb:

  • start with 2–4 concurrent requests per host
  • increase only after measuring success rate

If you scrape multiple sites, keep concurrency per host separate.
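
One way to keep per-host concurrency separate in a thread-based scraper is a semaphore per hostname. This is a sketch under that assumption; `HostLimiter`, `slot`, and the default of 2 are illustrative names and values, not a library API.

```python
import threading
from contextlib import contextmanager
from urllib.parse import urlsplit


class HostLimiter:
    """Caps in-flight requests per host (illustrative, thread-based)."""

    def __init__(self, per_host: int = 2):
        self._per_host = per_host
        self._lock = threading.Lock()
        self._semaphores: dict[str, threading.BoundedSemaphore] = {}

    def _semaphore_for(self, url: str) -> threading.BoundedSemaphore:
        host = urlsplit(url).netloc.lower()
        # Guard creation so two threads never build two semaphores for one host.
        with self._lock:
            if host not in self._semaphores:
                self._semaphores[host] = threading.BoundedSemaphore(self._per_host)
            return self._semaphores[host]

    @contextmanager
    def slot(self, url: str):
        sem = self._semaphore_for(url)
        sem.acquire()  # blocks while this host already has per_host requests in flight
        try:
            yield
        finally:
            sem.release()
```

Each worker would wrap its request in `with limiter.slot(url): ...`, so 50 threads can stay busy across many sites while any single host sees at most `per_host` requests at once.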


Fix #2: Honor Retry-After if present

If the server says “wait N seconds”, don’t guess.

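from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def retry_after_seconds(resp) -> int | None:
    # Retry-After is either delta-seconds ("120") or an HTTP-date.
    ra = resp.headers.get("Retry-After")
    if not ra:
        return None
    try:
        return max(0, int(ra))
    except ValueError:
        pass
    try:
        wait = parsedate_to_datetime(ra) - datetime.now(timezone.utc)
        return max(0, int(wait.total_seconds()))
    except (TypeError, ValueError):  # unparseable or timezone-less date
        return None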

Fix #3: Jittered exponential backoff (not just sleep(1))

Backoff without jitter creates “thundering herd” behavior if you have multiple workers: workers that were throttled together all wake up and retry at the same instant.

Use:

  • exponential growth
  • plus random jitter
  • plus a max cap
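
Those three ingredients fit in one small function. The sketch below uses the same formula as the fetch wrapper in this post (capped power of two plus a little random extra); the function name and parameter names are assumptions made for illustration.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  jitter: float = 2.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based)."""
    exponential = min(cap, base * 2 ** (attempt - 1))  # 1, 2, 4, ... up to cap
    return exponential + random.random() * jitter      # add 0..jitter seconds


# With jitter disabled you can see the underlying schedule:
schedule = [backoff_delay(a, jitter=0.0) for a in range(1, 7)]
# → [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

Other jitter schemes exist (for example, picking a random delay anywhere between zero and the capped value). The additive form is used here only because it matches the wrapper.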

A production-ready Python fetch wrapper (handles 429)

This wrapper:

  • routes through ProxiesAPI by default (pass use_proxiesapi=False to fetch directly)
  • retries on 429/5xx/timeouts
  • respects Retry-After when present
  • adds jittered exponential backoff

import os
import time
import random
import urllib.parse
import requests

PROXIESAPI_KEY = os.environ.get("PROXIESAPI_KEY", "")
TIMEOUT = (10, 40)

session = requests.Session()


def proxiesapi_url(target_url: str) -> str:
    if not PROXIESAPI_KEY:
        raise RuntimeError("Set PROXIESAPI_KEY in your environment")

    return (
        "http://api.proxiesapi.com/?auth_key="
        + urllib.parse.quote(PROXIESAPI_KEY, safe="")
        + "&url="
        + urllib.parse.quote(target_url, safe="")
    )


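def fetch_text(url: str, *, use_proxiesapi: bool = True, max_retries: int = 6) -> str:
    # Build the URL outside the retry loop: a missing API key is a config
    # error and should fail immediately, not after several backoff sleeps.
    final_url = proxiesapi_url(url) if use_proxiesapi else url
    last_err = None

    for attempt in range(1, max_retries + 1):
        # Default wait: 1s, 2s, 4s, ... capped at 30s, plus 0-2s of jitter.
        sleep_s = min(30, 2 ** (attempt - 1)) + random.random() * 2

        try:
            r = session.get(
                final_url,
                timeout=TIMEOUT,
                headers={
                    "User-Agent": (
                        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                        "AppleWebKit/537.36 (KHTML, like Gecko) "
                        "Chrome/123.0 Safari/537.36"
                    ),
                    "Accept-Language": "en-US,en;q=0.9",
                },
            )

            if r.status_code == 429:
                last_err = RuntimeError("HTTP 429 Too Many Requests")
                # retry_after_seconds() is the helper from Fix #2.
                ra = retry_after_seconds(r)
                if ra is not None:
                    sleep_s = ra  # the server's hint wins over our guess
            elif 500 <= r.status_code <= 599:
                last_err = RuntimeError(f"Upstream {r.status_code}")
            else:
                # Other 4xx errors (403, 404, ...) will not improve with a
                # retry, so let raise_for_status() propagate them right away.
                r.raise_for_status()
                text = r.text
                if text and len(text) >= 200:
                    return text
                last_err = RuntimeError("Suspiciously small response")

        except (requests.ConnectionError, requests.Timeout) as e:
            last_err = e

        # Don't sleep after the final attempt; just report the failure.
        if attempt < max_retries:
            time.sleep(sleep_s)

    raise RuntimeError(f"Fetch failed after {max_retries} attempts: {last_err}")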

Fix #4: Token bucket rate limiting (stop bursts)

If your scraper “fires” 100 requests at time 0, you’ll trigger a limit even if the average rate is okay.

Token bucket smooths that out:

  • you earn tokens at a steady rate
  • each request consumes a token
  • if no tokens, you wait

This is how many API clients are meant to behave.

Implementation depends on your architecture (async vs threads vs distributed workers), but conceptually:

  • keep a per-host bucket
  • set “tokens per second” and “burst size”
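
For a thread-based scraper, the idea above can be sketched in about thirty lines. This is a minimal illustration, not a drop-in library: `TokenBucket`, `rate`, and `burst` are names chosen here, and the injectable `clock` and `sleep` parameters exist only to make the class easy to test.

```python
import threading
import time


class TokenBucket:
    """Thread-safe token bucket: `rate` tokens/second, at most `burst` stored."""

    def __init__(self, rate: float, burst: int,
                 clock=time.monotonic, sleep=time.sleep):
        self.rate = rate
        self.burst = burst
        self._tokens = float(burst)  # start full so an initial burst is allowed
        self._clock = clock
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()

    def _refill(self) -> None:
        # Earn tokens for the time elapsed since the last check, up to `burst`.
        now = self._clock()
        self._tokens = min(self.burst, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        while True:
            with self._lock:
                self._refill()
                if self._tokens >= 1:
                    self._tokens -= 1
                    return
                wait = (1 - self._tokens) / self.rate
            # Sleep outside the lock so other threads are not blocked by us.
            self._sleep(wait)
```

To get a per-host bucket, keep a dict keyed by hostname and call `acquire()` on the matching bucket before each request. With `rate=2.0, burst=3`, the first three requests go out immediately and later ones are spaced about half a second apart.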

When proxies help (and when they make 429 worse)

Proxies can reduce 429s when:

  • limits are IP-based and you spread load responsibly
  • you need stable egress across long jobs

Proxies make things worse when:

  • you increase request rate because “we have proxies now”
  • the site rate limits by account/session/cookie (not IP)
  • you trigger bot defenses due to inconsistent identity signals

The correct mental model:

Proxies increase reliability, not permission, and not unlimited throughput.


Quick checklist for eliminating 429s

  • Cap concurrency per host (start 2–4)
  • Add jittered exponential backoff
  • Respect Retry-After
  • Separate budgets for listing vs detail endpoints
  • Use caching to avoid refetching unchanged pages
  • Only then consider proxy rotation for stability
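
The caching item is the easiest one to skip, so here is a rough sketch of the simplest version: an in-memory cache with a time-to-live, wrapped around whatever fetch function you already use. `TTLCache`, `ttl_s`, and the one-hour default are assumptions for illustration; a real job would likely persist the cache to disk or a database.

```python
import time
from typing import Callable


class TTLCache:
    """Tiny in-memory page cache keyed by URL (illustrative, not persistent)."""

    def __init__(self, fetch: Callable[[str], str], ttl_s: float = 3600.0,
                 clock=time.monotonic):
        self._fetch = fetch
        self._ttl_s = ttl_s
        self._clock = clock
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, url: str) -> str:
        hit = self._store.get(url)
        if hit is not None and self._clock() - hit[0] < self._ttl_s:
            return hit[1]        # fresh enough: no request is sent
        text = self._fetch(url)  # miss or expired: one real request
        self._store[url] = (self._clock(), text)
        return text
```

Every cache hit is a request the target never sees, which lowers your effective rate without slowing the job down.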