engineering

4 guides

Retry Policies for Web Scrapers: What to Retry vs Fail Fast

Learn a production-safe retry strategy with status-code rules, backoff, and a Python helper you can drop into any scraper.

Soft-Block Detection for Web Scraping (Python): Catch ‘HTTP 200 but Wrong Page’

Most scrapers fail silently: the request succeeds, but the HTML is a block, consent, or login page. Here’s how to detect soft-blocks before parsing.

Free Proxy Lists vs a Proxy API: Why Free Breaks in Production

Free proxies look attractive until your scraper scales. Here’s what fails first, what a proxy API actually fixes, and how to choose the right setup.

Retries, Timeouts, and Backoff for Web Scraping (Python): Production Defaults That Work

Most scrapers fail because of networking, not parsing. Here are sane timeout defaults, a retry policy that won’t DDoS a site, and a drop-in requests/httpx implementation.