HTTP 429 Too Many Requests While Scraping: Causes, Fixes, and Retry Patterns
If you scrape at scale, HTTP 429 Too Many Requests isn’t an “edge case”.
It’s the server telling you:
- “you’re too fast”
- “your traffic looks automated”
- or “your identity is rate-limited right now”
This post is a practical playbook for stopping 429s without cargo-culting random delays.
If you’re doing everything right and still seeing bursty failures, ProxiesAPI can help keep the fetch layer consistent, but you still need real rate limits and backoff.
What 429 actually means
HTTP 429 is returned when the server (or an upstream WAF/CDN) decides you’ve exceeded a limit.
That limit could be enforced by:
- the application (API gateway)
- the CDN (Cloudflare / Akamai / Fastly)
- a bot defense layer (fingerprinting + heuristics)
Sometimes you’ll also get:
- a Retry-After header
- a JSON error body
- a generic HTML “slow down” page
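Logging which of these signals a 429 carries can hint at which layer is enforcing the limit. The helper below is a minimal sketch: classify_429 and the keys of its result are hypothetical names chosen for illustration, and it only inspects the headers and body you pass in.

```python
import json


def classify_429(headers: dict[str, str], body: str) -> dict:
    """Summarize the hints a 429 response carries (hypothetical helper)."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {k.lower(): v for k, v in headers.items()}
    info = {"retry_after": lowered.get("retry-after"), "body_kind": "other"}
    content_type = lowered.get("content-type", "")
    if "json" in content_type:
        try:
            json.loads(body)
            info["body_kind"] = "json"
        except ValueError:
            pass  # declared as JSON but not parseable; leave as "other"
    elif "html" in content_type:
        info["body_kind"] = "html"
    return info
```

With requests, you would call it as classify_429(dict(resp.headers), resp.text) whenever resp.status_code is 429.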
Common causes (ranked by how often they bite scrapers)
| Cause | What it looks like | Fix |
|---|---|---|
| Too much concurrency | 429 spikes right after you increase workers | Cap concurrency per host |
| No backoff | repeated 429s for the same URL cluster | Exponential backoff + jitter |
| Identity-based limits | one IP/session gets blocked quickly | Rotate identity + pace requests |
| Request bursts | 200s then sudden 429 wave | Token bucket / leaky bucket limiter |
| Hot endpoints | listing/search endpoints 429 faster than detail pages | Separate budgets per endpoint type |
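The last row, separate budgets per endpoint type, could look roughly like the sketch below. The path prefixes, the budget numbers, and the function names are all assumptions made for illustration; real values have to come from watching how the target site responds.

```python
from urllib.parse import urlsplit

# Hypothetical budgets in requests per second; tune them from measured success rates.
BUDGETS = {"listing": 0.5, "detail": 2.0}

# Hypothetical path prefixes that mark listing/search pages on the target site.
LISTING_PREFIXES = ("/search", "/category", "/list")


def endpoint_kind(url: str) -> str:
    """Classify a URL as 'listing' or 'detail' by its path prefix."""
    path = urlsplit(url).path.lower()
    return "listing" if path.startswith(LISTING_PREFIXES) else "detail"


def budget_key(url: str) -> tuple[str, str]:
    """Key for looking up a limiter: one budget per (host, endpoint kind)."""
    return (urlsplit(url).netloc.lower(), endpoint_kind(url))


def min_interval(url: str) -> float:
    """Minimum spacing in seconds between requests of this URL's kind."""
    return 1.0 / BUDGETS[endpoint_kind(url)]
```

Keying limiters by budget_key(url) rather than by host alone lets slow listing pages throttle without starving detail-page fetches.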
Fix #1: Concurrency limits (the fastest win)
A common reason scrapers hit 429s is that they parallelize too aggressively.
Even a “small” scraper with 50 threads can hammer a single host.
Rule of thumb:
- start with 2–4 concurrent requests per host
- increase only after measuring success rate
If you scrape multiple sites, keep concurrency per host separate.
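One way to keep a separate cap per host in a threaded scraper is a semaphore per hostname. This is a minimal sketch, not the only design: MAX_PER_HOST, host_slot, and fetch_with_cap are hypothetical names, and the fetch argument stands for whatever function performs the request.

```python
import threading
from urllib.parse import urlsplit

MAX_PER_HOST = 3  # assumed starting point, inside the 2-4 range suggested above

_registry_lock = threading.Lock()
_host_slots: dict[str, threading.BoundedSemaphore] = {}


def host_slot(url: str) -> threading.BoundedSemaphore:
    """Return the semaphore guarding this URL's host, creating it on first use."""
    host = urlsplit(url).netloc.lower()
    with _registry_lock:
        if host not in _host_slots:
            _host_slots[host] = threading.BoundedSemaphore(MAX_PER_HOST)
        return _host_slots[host]


def fetch_with_cap(url: str, fetch):
    """Call fetch(url) while holding one of the host's concurrency slots."""
    with host_slot(url):
        return fetch(url)
```

Worker threads can then call fetch_with_cap(url, some_fetch_function) freely: however many workers exist, at most MAX_PER_HOST of them are inside a request to the same host at once.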
Fix #2: Honor Retry-After if present
If the server says “wait N seconds”, don’t guess.
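
    from datetime import datetime, timezone
    from email.utils import parsedate_to_datetime

    def retry_after_seconds(resp) -> float | None:
        # Retry-After holds either a number of seconds or an HTTP-date.
        ra = (resp.headers.get("Retry-After") or "").strip()
        if not ra:
            return None
        if ra.isdecimal():
            return float(ra)
        try:
            when = parsedate_to_datetime(ra)
            return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())
        except (TypeError, ValueError):
            return None  # unparseable value: fall back to your own backoff
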
Fix #3: Jittered exponential backoff (not just sleep(1))
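Backoff without jitter creates “thundering herd” behavior if you have multiple workers: workers that fail together also retry together, so each retry wave hits the server at the same instant.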
Use:
- exponential growth
- plus random jitter
- plus a max cap
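Those three ingredients fit in a few lines. The sketch below assumes a 1-second base, a 30-second cap, and up to 2 seconds of jitter; backoff_delay and its parameter names are illustrative, not a standard API.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0, jitter: float = 2.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based)."""
    # Exponential growth: base, 2*base, 4*base, ... limited by the cap.
    exponential = min(cap, base * 2 ** (attempt - 1))
    # Random jitter spreads out workers that failed at the same moment.
    return exponential + random.uniform(0, jitter)
```

With the defaults, attempt 1 waits between 1 and 3 seconds, attempt 4 between 8 and 10, and from attempt 6 onward the wait stays between 30 and 32 seconds.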
A production-ready Python fetch wrapper (handles 429)
This wrapper:
- routes requests through ProxiesAPI by default (pass use_proxiesapi=False to fetch directly)
- retries on 429/5xx/timeouts
- respects Retry-After when present
- adds jittered exponential backoff

    import os
    import random
    import time
    import urllib.parse

    import requests

    PROXIESAPI_KEY = os.environ.get("PROXIESAPI_KEY", "")
    TIMEOUT = (10, 40)  # (connect, read) timeouts in seconds
    HEADERS = {
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/123.0 Safari/537.36"
        ),
        "Accept-Language": "en-US,en;q=0.9",
    }

    session = requests.Session()


    def proxiesapi_url(target_url: str) -> str:
        if not PROXIESAPI_KEY:
            raise RuntimeError("Set PROXIESAPI_KEY in your environment")
        return (
            "http://api.proxiesapi.com/?auth_key="
            + urllib.parse.quote(PROXIESAPI_KEY, safe="")
            + "&url="
            + urllib.parse.quote(target_url, safe="")
        )


    def backoff_seconds(attempt: int) -> float:
        # 1, 2, 4, ... seconds capped at 30, plus up to 2 seconds of random jitter
        return min(30, 2 ** (attempt - 1)) + random.random() * 2


    def fetch_text(url: str, *, use_proxiesapi: bool = True, max_retries: int = 6) -> str:
        # Build the URL before the loop so a missing key fails fast instead of being retried
        final_url = proxiesapi_url(url) if use_proxiesapi else url
        last_err = None
        for attempt in range(1, max_retries + 1):
            sleep_s = backoff_seconds(attempt)
            try:
                r = session.get(final_url, timeout=TIMEOUT, headers=HEADERS)
                if r.status_code == 429:
                    last_err = RuntimeError("HTTP 429 Too Many Requests")
                    # retry_after_seconds() is the helper defined under Fix #2
                    ra = retry_after_seconds(r)
                    if ra is not None:
                        sleep_s = ra
                elif 500 <= r.status_code <= 599:
                    last_err = RuntimeError(f"Upstream {r.status_code}")
                else:
                    r.raise_for_status()  # raises HTTPError for the remaining 4xx codes
                    if len(r.text) >= 200:
                        return r.text
                    last_err = RuntimeError("Suspiciously small response")
            except requests.HTTPError:
                raise  # non-429 4xx (404, 403, ...): retrying will not help
            except requests.RequestException as e:
                last_err = e  # timeouts, connection resets, and similar transient errors
            if attempt < max_retries:
                time.sleep(sleep_s)  # no pointless sleep after the final attempt
        raise RuntimeError(f"Fetch failed after {max_retries} attempts: {last_err}")

Fix #4: Token bucket rate limiting (stop bursts)
If your scraper “fires” 100 requests at time 0, you can trigger a limit even if the average rate is okay.
Token bucket smooths that out:
- you earn tokens at a steady rate
- each request consumes a token
- if no tokens, you wait
This is how many API clients are meant to behave.
Implementation depends on your architecture (async vs threads vs distributed workers), but conceptually:
- keep a per-host bucket
- set “tokens per second” and “burst size”
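For a threaded scraper, the idea can be sketched as below. This is one possible implementation under assumptions, not a reference one: the TokenBucket class, bucket_for, and the default rate of 2 tokens per second with a burst of 4 are illustrative, and distributed workers would need shared state (for example in a central store) instead of an in-process dict.

```python
import threading
import time
from urllib.parse import urlsplit


class TokenBucket:
    """Thread-safe token bucket: `rate` tokens per second, at most `burst` stored."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so a small initial burst is allowed
        self.updated = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        while True:
            with self.lock:
                now = time.monotonic()
                # Refill for the time elapsed since the last update, up to capacity.
                self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                wait = (1 - self.tokens) / self.rate
            # Sleep outside the lock so other threads are not blocked meanwhile.
            time.sleep(wait)


_buckets: dict[str, TokenBucket] = {}
_buckets_lock = threading.Lock()


def bucket_for(url: str, rate: float = 2.0, burst: int = 4) -> TokenBucket:
    """Return the bucket for this URL's host, creating it on first use."""
    host = urlsplit(url).netloc.lower()
    with _buckets_lock:
        if host not in _buckets:
            _buckets[host] = TokenBucket(rate, burst)
        return _buckets[host]
```

Calling bucket_for(url).acquire() right before each request lets the first few requests through immediately (the burst) and then paces the rest at the configured rate.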
When proxies help (and when they make 429 worse)
Proxies can reduce 429s when:
- limits are IP-based and you spread load responsibly
- you need stable egress across long jobs
Proxies make things worse when:
- you increase request rate because “we have proxies now”
- the site rate limits by account/session/cookie (not IP)
- you trigger bot defenses due to inconsistent identity signals
The correct mental model:
Proxies increase reliability, not permission, and not unlimited throughput.
Quick checklist for eliminating 429s
- Cap concurrency per host (start 2–4)
- Add jittered exponential backoff
- Respect Retry-After
- Separate budgets for listing vs detail endpoints
- Use caching to avoid refetching unchanged pages
- Only then consider proxy rotation for stability