Web Scraping Queues: Concurrency, Retries, and Backpressure in Production

Jul 05, 2026 · seo · #web scraping queue, #web scraping, #concurrency, #retries, #backpressure, #python, #operations, #proxies

A web scraping queue is not just a list of URLs waiting for workers. In production, the queue is the control system that decides:

how many requests run at once
when a failed job should retry
when workers should slow down
how to avoid flooding one domain while another sits idle

If you skip that control system, the scraper usually fails in one of two ways:

it overwhelms the target and gets blocked
it overwhelms itself with retries, memory growth, or stuck workers

This guide covers the production basics: concurrency, retries, and backpressure, with practical patterns you can implement quickly.

A stronger fetch layer only helps if your queue behaves sensibly

ProxiesAPI can make outbound requests more reliable, but your queue still needs bounded concurrency, retry discipline, and backpressure. Otherwise you just fail faster.


The three jobs your scraping queue must do

At minimum, a queue for scraping should do three things well:

Job	What it means	Failure if missing
schedule work	decide which URL runs next	hot targets dominate the queue
control concurrency	cap how many workers run together	you self-DDoS or get blocked
absorb failure	retry safely without retry storms	transient errors become outages

Everything else is optional compared with those three.

Concurrency should be bounded, not "as fast as possible"

A common beginner mistake is launching as many workers as the machine can handle. Machine capacity is the wrong limit.

The real limits are:

target-site tolerance
proxy pool capacity
database write throughput
parser CPU cost

That means concurrency should be bounded globally and, ideally, per domain.

Queue policy	Result
unlimited concurrency	bursty failures, blocks, unstable latencies
small fixed concurrency	stable, predictable runs
per-domain concurrency caps	better fairness and fewer hot-spot bans

In practice, many scrapers become healthier when concurrency goes down, not up, because fewer requests end in blocks, timeouts, and retries.

Retries should be selective

Retries are necessary, but not every error deserves one.

Condition	Retry?	Reason
timeout	yes	often transient
429 rate limited	yes, with longer delay	target is asking you to slow down
500/502/503/504	yes	upstream instability
parser bug	no	code will fail the same way again
hard 404	usually no	likely permanent
repeated captcha/challenge page	no immediate tight retry	needs slower policy or different routing

This is the core principle: retry transport failures, not logic failures.
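The table can be folded into one small decision function. The sketch below is one possible encoding, not a canonical one: the names `RetryDecision` and `classify` are made up for illustration, and treating every exception other than a timeout or connection error as a logic failure is an assumption you may want to refine.

```python
from __future__ import annotations

from enum import Enum


class RetryDecision(Enum):
    RETRY = "retry"                # retry with the normal backoff
    RETRY_SLOWER = "retry_slower"  # retry, but with a longer delay
    DROP = "drop"                  # do not retry


# Status codes the table treats as upstream instability.
TRANSIENT_STATUSES = {500, 502, 503, 504}


def classify(status: int | None = None, error: Exception | None = None,
             challenge_page: bool = False) -> RetryDecision:
    """Map one failed fetch outcome to a retry decision."""
    if isinstance(error, (TimeoutError, ConnectionError)):
        return RetryDecision.RETRY         # transport failure, often transient
    if error is not None:
        return RetryDecision.DROP          # logic failure: it will fail the same way again
    if challenge_page or status == 429:
        return RetryDecision.RETRY_SLOWER  # the target is asking for a slower pace
    if status in TRANSIENT_STATUSES:
        return RetryDecision.RETRY
    return RetryDecision.DROP              # 404 and anything else: treated as permanent here


print(classify(status=503))                        # RetryDecision.RETRY
print(classify(status=429))                        # RetryDecision.RETRY_SLOWER
print(classify(error=ValueError("bad selector")))  # RetryDecision.DROP
```

Keeping the decision in one place makes it easy to log why a job was dropped and to change the policy without touching worker code.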

Backpressure is how the system tells itself to slow down

Backpressure means your system can detect overload and reduce the rate of new work instead of pretending everything is fine.

In scraping, overload usually shows up as:

queue length growing faster than it drains
worker latency climbing
rising 429 or 5xx rates
database writes falling behind
proxy errors increasing

Without backpressure, operators respond by adding more workers, which often makes the incident worse.
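One of the simplest forms of backpressure is a bounded queue: when it is full, the producer has to wait. The following is a minimal sketch, assuming an in-process `asyncio.Queue`. The queue size, the worker count, and the sleep that stands in for fetching are placeholders.

```python
from __future__ import annotations

import asyncio


async def producer(queue: asyncio.Queue, urls: list[str], sizes: list[int]) -> None:
    for url in urls:
        # put() suspends while the queue is full, so discovery slows to the workers' pace.
        await queue.put(url)
        sizes.append(queue.qsize())


async def worker(queue: asyncio.Queue, done: list[str]) -> None:
    while True:
        url = await queue.get()
        await asyncio.sleep(0.01)  # stand-in for fetch and parse
        done.append(url)
        queue.task_done()


async def run(urls: list[str], max_queue_size: int = 5, worker_count: int = 2) -> tuple[list[str], int]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=max_queue_size)
    done: list[str] = []
    sizes: list[int] = []
    tasks = [asyncio.create_task(worker(queue, done)) for _ in range(worker_count)]
    await producer(queue, urls, sizes)
    await queue.join()  # wait until every queued URL has been processed
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return done, max(sizes)


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(30)]
    done, peak = asyncio.run(run(urls))
    print("processed", len(done), "peak queue size", peak)
```

The producer never gets more than `max_queue_size` jobs ahead of the workers, which keeps memory flat no matter how many URLs discovery finds.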

A practical queue shape

A solid production queue often looks like this:

producer discovers or refreshes URLs
scheduler ranks them and pushes jobs into a work queue
workers fetch and parse
failures go to a retry queue with delay metadata
poison jobs go to a dead-letter queue

This separation matters because "not successful yet" and "should never be retried immediately" are not the same state.

Example: bounded worker pool with retry metadata

Here is a compact Python example using asyncio to show the control flow:

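from __future__ import annotations

import asyncio
import itertools
import random
from dataclasses import dataclass, field
from time import monotonic


@dataclass
class Job:
    url: str
    domain: str
    attempt: int = 1
    next_run_at: float = field(default_factory=monotonic)


MAX_CONCURRENCY = 8   # global cap: number of worker tasks started in main()
MAX_RETRIES = 4       # maximum attempts per job, counting the first
PER_DOMAIN_LIMIT = 2  # in-flight requests allowed per domain

domain_semaphores: dict[str, asyncio.Semaphore] = {}
# Tie-breaker so the priority queue never has to compare two Job objects.
retry_sequence = itertools.count()


def domain_gate(domain: str) -> asyncio.Semaphore:
    # Create each domain's semaphore on first use.
    if domain not in domain_semaphores:
        domain_semaphores[domain] = asyncio.Semaphore(PER_DOMAIN_LIMIT)
    return domain_semaphores[domain]


async def fetch(job: Job) -> str:
    # Replace with real HTTP work.
    await asyncio.sleep(0.3)
    if random.random() < 0.15:
        raise TimeoutError("transient timeout")
    return f"<html>{job.url}</html>"


async def handle(job: Job) -> None:
    # Hold the per-domain slot only while the request is in flight.
    async with domain_gate(job.domain):
        html = await fetch(job)
        print("fetched", job.url, "bytes", len(html))


async def worker(work_queue: asyncio.Queue, retry_queue: asyncio.PriorityQueue) -> None:
    while True:
        job = await work_queue.get()
        requeued = False
        try:
            await handle(job)
        except TimeoutError:
            if job.attempt < MAX_RETRIES:
                # Exponential backoff capped at 60 seconds, plus up to 1 second of jitter.
                delay = min(2 ** job.attempt, 60) + random.random()
                retry_job = Job(
                    url=job.url,
                    domain=job.domain,
                    attempt=job.attempt + 1,
                    next_run_at=monotonic() + delay,
                )
                await retry_queue.put((retry_job.next_run_at, next(retry_sequence), retry_job))
                requeued = True
            else:
                print("dead-letter", job.url)
        finally:
            # A job waiting for a retry is still unfinished, so join() keeps waiting for it.
            if not requeued:
                work_queue.task_done()


async def retry_scheduler(work_queue: asyncio.Queue, retry_queue: asyncio.PriorityQueue) -> None:
    # Hand each delayed job back to the workers once its run time arrives.
    # Simplification: a job added while this sleeps waits its turn, even if it is due sooner.
    while True:
        run_at, _, job = await retry_queue.get()
        await asyncio.sleep(max(0.0, run_at - monotonic()))
        work_queue.put_nowait(job)
        work_queue.task_done()  # put_nowait() counted the job again; keep it counted once


async def main(jobs: list[Job]) -> None:
    work_queue: asyncio.Queue = asyncio.Queue()
    retry_queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    for job in jobs:
        work_queue.put_nowait(job)
    # The fixed number of worker tasks is the global concurrency bound.
    tasks = [
        asyncio.create_task(worker(work_queue, retry_queue))
        for _ in range(MAX_CONCURRENCY)
    ]
    tasks.append(asyncio.create_task(retry_scheduler(work_queue, retry_queue)))
    await work_queue.join()  # returns once every job has succeeded or been dead-lettered
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)


if __name__ == "__main__":
    # Example run with placeholder URLs.
    asyncio.run(main([Job(f"https://example.com/page/{i}", "example.com") for i in range(10)]))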

This example shows the important parts:

bounded global concurrency
per-domain concurrency caps
delayed retries instead of instant loops
a dead-letter outcome after enough failures

Why immediate retries are dangerous

Immediate retries create retry storms:

the same unstable target gets hit again instantly
workers stay occupied by the same failing jobs
fresh work never gets a chance

That is why retry queues need scheduled delays, not just "put it back at the end."

Exponential backoff with jitter is the default safe choice.
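As a standalone helper, the delay calculation might look like the sketch below. It uses the "full jitter" variant, where the whole delay is drawn at random between zero and a capped exponential ceiling. This is a swapped-in alternative to adding a small random offset on top of a fixed exponential delay, as the worker example does. The function name and the default `base` and `cap` values are assumptions.

```python
from __future__ import annotations

import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: random.Random | None = None) -> float:
    """Return a delay in seconds for a 1-based attempt number."""
    rng = rng or random
    ceiling = min(cap, base * 2 ** attempt)  # exponential growth, capped
    return rng.uniform(0.0, ceiling)         # full jitter: anywhere between 0 and the ceiling


if __name__ == "__main__":
    for attempt in range(1, 8):
        ceiling = min(60.0, 2 ** attempt)
        print(f"attempt {attempt}: up to {ceiling:.0f}s, sampled {backoff_delay(attempt):.2f}s")
```

Full jitter spreads retries across the whole window, so jobs that failed together are less likely to come back together. The trade-off is that some retries fire sooner than a fixed exponential delay would allow.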

Backpressure rules worth implementing early

You do not need a full distributed systems thesis. A few simple rules solve most real issues.

Signal	Backpressure action
queue size above threshold	pause discovery or reduce enqueue rate
429 rate spikes on one domain	lower that domain's concurrency
average fetch latency doubles	reduce global concurrency
retry queue dominates total work	stop adding fresh low-value jobs
database lag rises	slow workers before writes start failing

These rules turn overload from a surprise into a managed state.
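These rules are small enough to express as plain functions over a metrics snapshot. The sketch below is illustrative only: the `Metrics` and `Limits` shapes, the field names, and every threshold are assumptions, and halving is just one possible reduction policy.

```python
from __future__ import annotations

from dataclasses import dataclass

# Illustrative thresholds (assumptions); tune them from your own measurements.
QUEUE_SIZE_LIMIT = 10_000
RATE_429_LIMIT = 0.05
DB_LAG_LIMIT_SECONDS = 5.0


@dataclass
class Metrics:
    queue_size: int          # fresh jobs waiting
    retry_queue_size: int    # jobs waiting for a delayed retry
    avg_latency: float       # recent average fetch latency, in seconds
    baseline_latency: float  # fetch latency when healthy, in seconds
    db_lag_seconds: float    # how far database writes are behind


@dataclass
class Limits:
    global_concurrency: int
    pause_discovery: bool = False
    accept_low_value_jobs: bool = True


def apply_backpressure(m: Metrics, current: Limits) -> Limits:
    """Return new limits for the next interval based on one metrics snapshot."""
    new = Limits(global_concurrency=current.global_concurrency)
    if m.queue_size > QUEUE_SIZE_LIMIT:
        new.pause_discovery = True
    if m.avg_latency >= 2 * m.baseline_latency or m.db_lag_seconds > DB_LAG_LIMIT_SECONDS:
        new.global_concurrency = max(1, current.global_concurrency // 2)
    if m.retry_queue_size > m.queue_size:  # retries outnumber fresh work
        new.accept_low_value_jobs = False
    return new


def adjust_domain_limits(rate_429: dict[str, float], limits: dict[str, int]) -> dict[str, int]:
    """Halve the cap of any domain whose recent 429 share is above the threshold."""
    return {
        domain: max(1, limit // 2) if rate_429.get(domain, 0.0) > RATE_429_LIMIT else limit
        for domain, limit in limits.items()
    }


if __name__ == "__main__":
    snapshot = Metrics(queue_size=20_000, retry_queue_size=30_000,
                       avg_latency=1.0, baseline_latency=0.4, db_lag_seconds=0.0)
    print(apply_backpressure(snapshot, Limits(global_concurrency=8)))
    print(adjust_domain_limits({"a.example": 0.2}, {"a.example": 4, "b.example": 4}))
```

Running functions like these on a timer, and logging every change they make, gives operators a record of when and why the system slowed itself down.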

Separate high-value jobs from background jobs

Not every URL should compete in the same pool.

Good examples of separate lanes:

urgent refresh jobs for high-value product pages
normal scheduled recrawls
low-priority discovery jobs
heavy browser jobs that need Playwright

If you mix all of those together, simple HTML tasks get stuck behind expensive browser work, and the queue stops feeling predictable.
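One way to keep lanes apart is to give each its own queue and its own fixed set of workers, so a slow lane can only exhaust its own capacity. The following is a rough sketch, assuming in-process asyncio queues. The lane names mirror the list above, while the worker counts and sleep times are invented for the demo.

```python
from __future__ import annotations

import asyncio

# Hypothetical worker counts per lane; size them from your own workload.
LANE_WORKERS = {"urgent": 4, "recrawl": 2, "discovery": 1, "browser": 1}


async def lane_worker(lane: str, queue: asyncio.Queue, done: list[tuple[str, str]]) -> None:
    while True:
        url = await queue.get()
        # Stand-in for real work: browser jobs are made slower than plain HTML fetches.
        await asyncio.sleep(0.05 if lane == "browser" else 0.005)
        done.append((lane, url))
        queue.task_done()


async def run_lanes(jobs: list[tuple[str, str]]) -> list[tuple[str, str]]:
    queues = {lane: asyncio.Queue() for lane in LANE_WORKERS}
    done: list[tuple[str, str]] = []
    for lane, url in jobs:
        queues[lane].put_nowait(url)
    # Each lane gets its own workers, so lanes never compete for the same slots.
    tasks = [
        asyncio.create_task(lane_worker(lane, queues[lane], done))
        for lane, count in LANE_WORKERS.items()
        for _ in range(count)
    ]
    await asyncio.gather(*(queue.join() for queue in queues.values()))
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return done


if __name__ == "__main__":
    jobs = [("browser", f"https://example.com/app/{i}") for i in range(3)]
    jobs += [("urgent", f"https://example.com/product/{i}") for i in range(8)]
    for lane, url in asyncio.run(run_lanes(jobs)):
        print(lane, url)
```

Dedicated workers cost some idle capacity when a lane is empty. A shared pool with weighted scheduling is an alternative when that waste matters more than isolation.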

Where ProxiesAPI fits

ProxiesAPI belongs in the fetch layer, not the queue layer.

That means:

workers decide what to fetch and when
the fetch layer decides how to route the request
parser logic stays unchanged

This separation is useful because queue behavior problems are rarely fixed by proxy routing alone. If the queue is unbounded or retries are undisciplined, better networking just lets the system misbehave more efficiently.
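In code, this boundary can be as thin as a function argument. The sketch below is hypothetical throughout: `direct_fetch` and `routed_fetch` are stand-ins that return canned HTML, and `routed_fetch` does not reflect ProxiesAPI's actual interface, which should be taken from its own documentation.

```python
from __future__ import annotations

from typing import Callable

# A fetcher takes a URL and returns HTML. How it routes the request is its own concern.
Fetcher = Callable[[str], str]


def direct_fetch(url: str) -> str:
    # Stand-in for a plain HTTP GET.
    return f"<html><title>{url}</title></html>"


def routed_fetch(url: str) -> str:
    # Stand-in for a request sent through a proxy service (hypothetical, no real API shown).
    return f"<html><title>{url}</title></html>"


def parse_title(html: str) -> str:
    # The parser sees only HTML, so it does not change when routing changes.
    start = html.index("<title>") + len("<title>")
    return html[start:html.index("</title>")]


def process(url: str, fetch: Fetcher) -> str:
    """Worker-side logic: it chooses the URL and calls whichever fetcher it was given."""
    return parse_title(fetch(url))


if __name__ == "__main__":
    url = "https://example.com/item/1"
    print(process(url, direct_fetch))
    print(process(url, routed_fetch))
```

Because the worker only depends on the `Fetcher` signature, swapping the routing strategy does not touch queue or parser code.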

Production checklist

Before calling your scraper "production-ready," check these:

Capability	Why it matters
bounded global concurrency	prevents self-inflicted spikes
per-domain concurrency caps	protects targets and lowers ban risk
delayed retry queue	avoids retry storms
dead-letter handling	stops hopeless jobs from looping forever
queue metrics	lets you see overload before users do
priority lanes	protects important jobs from noisy background work

This is the boring engineering that keeps scrapers alive.

The practical takeaway

If you are designing a web scraping queue, do not start with the question "How many workers can I run?"

Start with:

what work deserves priority
how much concurrency each target can tolerate
which failures deserve retries
what signal should make the system slow down

That is the difference between a scraper that runs fast in a demo and a scraper that survives in production for months.
