Web Scraping Queues: Concurrency, Retries, and Backpressure in Production

A web scraping queue is not just a list of URLs waiting for workers. In production, the queue is the control system that decides:

  • how many requests run at once
  • when a failed job should retry
  • when workers should slow down
  • how to avoid flooding one domain while another sits idle

If you skip that control system, the scraper usually fails in one of two ways:

  • it overwhelms the target and gets blocked
  • it overwhelms itself with retries, memory growth, or stuck workers

This guide covers the production basics: concurrency, retries, and backpressure, with practical patterns you can implement quickly.

A stronger fetch layer only helps if your queue behaves sensibly

ProxiesAPI can make outbound requests more reliable, but your queue still needs bounded concurrency, retry discipline, and backpressure. Otherwise you just fail faster.


The three jobs your scraping queue must do

At minimum, a queue for scraping should do three things well:

| Job | What it means | Failure if missing |
| --- | --- | --- |
| schedule work | decide which URL runs next | hot targets dominate the queue |
| control concurrency | cap how many workers run together | you self-DDoS or get blocked |
| absorb failure | retry safely without retry storms | transient errors become outages |

Everything else is optional compared with those three.


Concurrency should be bounded, not "as fast as possible"

A common beginner mistake is to launch as many workers as the machine can handle. Machine capacity is the wrong limit.

The real limits are:

  • target-site tolerance
  • proxy pool capacity
  • database write throughput
  • parser CPU cost

That means concurrency should be bounded globally and, ideally, per domain.

| Queue policy | Result |
| --- | --- |
| unlimited concurrency | bursty failures, blocks, unstable latencies |
| small fixed concurrency | stable, predictable runs |
| per-domain concurrency caps | better fairness and fewer hot-spot bans |

In practice, many scrapers become healthier when concurrency goes down, not up.


Retries should be selective

Retries are necessary, but not every error deserves one.

| Condition | Retry? | Reason |
| --- | --- | --- |
| timeout | yes | often transient |
| 429 rate limited | yes, with longer delay | target is asking you to slow down |
| 500/502/503/504 | yes | upstream instability |
| parser bug | no | code will fail the same way again |
| hard 404 | usually no | likely permanent |
| repeated captcha/challenge page | no immediate tight retry | needs slower policy or different routing |

This is the core principle: retry transport failures, not logic failures.
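
The retry table can be written as a small classification function. This is a minimal sketch, not a complete policy: the `Decision` enum, the `classify` function, and the `is_challenge_page` flag are hypothetical names introduced here, and detecting a challenge page is left to your fetch layer.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    RETRY = "retry"            # transient failure: retry with normal backoff
    RETRY_SLOW = "retry_slow"  # target wants less traffic: retry with a longer delay
    DROP = "drop"              # permanent or logic failure: no automatic retry


RETRYABLE_STATUS = {500, 502, 503, 504}


def classify(status: Optional[int] = None,
             error: Optional[Exception] = None,
             is_challenge_page: bool = False) -> Decision:
    """Map one failed attempt to a retry decision (hypothetical helper)."""
    if is_challenge_page:
        # A tight retry would hit the same challenge again, so only retry slowly.
        return Decision.RETRY_SLOW
    if isinstance(error, TimeoutError):
        return Decision.RETRY
    if status == 429:
        return Decision.RETRY_SLOW
    if status in RETRYABLE_STATUS:
        return Decision.RETRY
    # 404s, parser bugs, and anything unrecognized are treated as non-retryable.
    return Decision.DROP
```

Defaulting unknown failures to `DROP` is a deliberate assumption in this sketch: an unrecognized error is more likely a logic failure than a transport failure, and dropping it keeps it out of the retry loop.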


Backpressure is how the system tells itself to slow down

Backpressure means your system can detect overload and reduce the rate of new work instead of pretending everything is fine.

In scraping, overload usually shows up as:

  • queue length growing faster than it drains
  • worker latency climbing
  • rising 429 or 5xx rates
  • database writes falling behind
  • proxy errors increasing

Without backpressure, operators respond by adding more workers, which often makes the incident worse.
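
Inside a single process, one simple form of backpressure is a bounded queue: when it is full, the producer waits instead of piling up more work. The sketch below uses `asyncio.Queue(maxsize=...)`; the `run_demo` function, the URLs, and the sizes are illustrative assumptions.

```python
import asyncio
from typing import List, Tuple


async def run_demo(n_urls: int = 20, maxsize: int = 5) -> Tuple[int, int]:
    """Return (jobs processed, largest queue size seen by the producer)."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    processed: List[str] = []
    peak = 0

    async def producer() -> None:
        nonlocal peak
        for i in range(n_urls):
            # put() suspends while the queue is full, so discovery
            # slows down to the speed of the workers.
            await queue.put(f"https://example.com/page/{i}")
            peak = max(peak, queue.qsize())

    async def consumer() -> None:
        while True:
            url = await queue.get()
            await asyncio.sleep(0.01)  # stand-in for fetch + parse
            processed.append(url)
            queue.task_done()

    workers = [asyncio.create_task(consumer()) for _ in range(2)]
    await producer()
    await queue.join()  # wait until every queued URL has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return len(processed), peak
```

Calling `asyncio.run(run_demo())` processes every URL while the queue never grows past `maxsize`, which is the property an unbounded queue cannot give you.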


A practical queue shape

A solid production queue often looks like this:

  1. producer discovers or refreshes URLs
  2. scheduler ranks them and pushes jobs into a work queue
  3. workers fetch and parse
  4. failures go to a retry queue with delay metadata
  5. poison jobs go to a dead-letter queue

This separation matters because "not successful yet" and "should not be retried automatically" are different states: the first belongs in the retry queue with a delay, the second in the dead-letter queue.


Example: bounded worker pool with retry metadata

Here is a compact Python example using asyncio to show the control flow:

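from __future__ import annotations

import asyncio
import itertools
import random
from dataclasses import dataclass, field
from time import monotonic


@dataclass
class Job:
    url: str
    domain: str
    attempt: int = 1
    next_run_at: float = field(default_factory=monotonic)


MAX_CONCURRENCY = 8   # global cap: number of worker tasks
MAX_RETRIES = 4       # maximum attempts per job, including the first
PER_DOMAIN_LIMIT = 2  # in-flight requests allowed per domain

domain_semaphores: dict[str, asyncio.Semaphore] = {}
retry_seq = itertools.count()  # tie-breaker so the heap never compares Job objects


def domain_gate(domain: str) -> asyncio.Semaphore:
    if domain not in domain_semaphores:
        domain_semaphores[domain] = asyncio.Semaphore(PER_DOMAIN_LIMIT)
    return domain_semaphores[domain]


async def fetch(job: Job) -> str:
    # Replace with real HTTP work.
    await asyncio.sleep(0.3)
    if random.random() < 0.15:
        raise TimeoutError("transient timeout")
    return f"<html>{job.url}</html>"


async def handle(job: Job) -> None:
    # At most PER_DOMAIN_LIMIT fetches per domain run at the same time.
    async with domain_gate(job.domain):
        html = await fetch(job)
        print("fetched", job.url, "bytes", len(html))


async def worker(work_queue: asyncio.Queue, retry_queue: asyncio.PriorityQueue) -> None:
    while True:
        job = await work_queue.get()
        try:
            await handle(job)
        except TimeoutError:
            if job.attempt < MAX_RETRIES:
                # Exponential backoff capped at 60s, plus up to 1s of jitter.
                delay = min(2 ** job.attempt, 60) + random.random()
                retry_job = Job(
                    url=job.url,
                    domain=job.domain,
                    attempt=job.attempt + 1,
                    next_run_at=monotonic() + delay,
                )
                await retry_queue.put((retry_job.next_run_at, next(retry_seq), retry_job))
            else:
                print("dead-letter", job.url)
        finally:
            work_queue.task_done()


async def retry_scheduler(work_queue: asyncio.Queue, retry_queue: asyncio.PriorityQueue) -> None:
    # Move each retry back to the work queue once its delay has elapsed.
    # Simplification: while sleeping on one job, a later arrival that is due sooner waits too.
    while True:
        run_at, _, job = await retry_queue.get()
        await asyncio.sleep(max(0.0, run_at - monotonic()))
        await work_queue.put(job)
        retry_queue.task_done()


async def main(jobs: list[Job]) -> None:
    work_queue: asyncio.Queue = asyncio.Queue()
    retry_queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    # Global concurrency is bounded by the number of worker tasks.
    tasks = [asyncio.create_task(worker(work_queue, retry_queue)) for _ in range(MAX_CONCURRENCY)]
    tasks.append(asyncio.create_task(retry_scheduler(work_queue, retry_queue)))
    for job in jobs:
        await work_queue.put(job)
    # Each round finishes the queued attempts, then waits for scheduled retries to be re-queued.
    for _ in range(MAX_RETRIES):
        await work_queue.join()
        await retry_queue.join()
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)


if __name__ == "__main__":
    demo = [Job(f"https://example.com/page/{i}", "example.com") for i in range(10)]
    asyncio.run(main(demo))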

This example shows the important parts:

  • bounded global concurrency
  • per-domain concurrency caps
  • delayed retries instead of instant loops
  • a dead-letter outcome after enough failures

Why immediate retries are dangerous

Immediate retries create retry storms:

  • the same unstable target gets hit again instantly
  • workers stay occupied by the same failing jobs
  • fresh work never gets a chance

That is why retry queues need scheduled delays, not just "put it back at the end."

Exponential backoff with jitter is the default safe choice.
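
As a minimal sketch of one way to compute such a delay, the function below uses the "full jitter" variant, where the whole delay is drawn at random below an exponentially growing ceiling. The `backoff_delay` name, the 1-second base, and the 60-second cap are assumptions for illustration, not values any particular target requires.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Return a retry delay in seconds for a 1-based attempt number.

    The ceiling doubles with each attempt and is capped at `cap`. The delay is
    a uniform random value between 0 and that ceiling, so jobs that failed at
    the same moment do not all come back at the same moment.
    """
    source = rng or random
    ceiling = min(cap, base * (2 ** attempt))
    return source.uniform(0.0, ceiling)
```

Passing a seeded `random.Random` as `rng` makes the delays reproducible in tests. If you would rather guarantee a minimum wait, add the jitter on top of the exponential delay instead of replacing it.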


Backpressure rules worth implementing early

You do not need a full distributed systems thesis. A few simple rules solve most real issues.

| Signal | Backpressure action |
| --- | --- |
| queue size above threshold | pause discovery or reduce enqueue rate |
| 429 rate spikes on one domain | lower that domain's concurrency |
| average fetch latency doubles | reduce global concurrency |
| retry queue dominates total work | stop adding fresh low-value jobs |
| database lag rises | slow workers before writes start failing |

These rules turn overload from a surprise into a managed state.
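
These rules are simple enough to evaluate in one function that runs on a timer. The sketch below is only an illustration: the `Signals` fields, the action names, and every threshold are assumptions you would replace with numbers from your own metrics.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Signals:
    queue_size: int
    domain_429_rate: float  # share of recent responses from one domain that were 429
    latency_ratio: float    # current average fetch latency divided by the baseline
    retry_share: float      # share of pending work that sits in the retry queue
    db_lag_seconds: float   # how far database writes are behind


def backpressure_actions(s: Signals,
                         max_queue_size: int = 10_000,
                         max_429_rate: float = 0.05,
                         max_retry_share: float = 0.5,
                         max_db_lag_seconds: float = 5.0) -> List[str]:
    """Return the actions suggested by the current signals (thresholds are assumed)."""
    actions = []
    if s.queue_size > max_queue_size:
        actions.append("pause_discovery")
    if s.domain_429_rate > max_429_rate:
        actions.append("lower_domain_concurrency")
    if s.latency_ratio >= 2.0:  # average fetch latency has doubled
        actions.append("reduce_global_concurrency")
    if s.retry_share > max_retry_share:
        actions.append("stop_low_value_enqueue")
    if s.db_lag_seconds > max_db_lag_seconds:
        actions.append("slow_workers")
    return actions
```

Returning a list of named actions rather than acting directly keeps the rules easy to log and test; the code that actually changes limits can live elsewhere.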


Separate high-value jobs from background jobs

Not every URL should compete in the same pool.

Good examples of separate lanes:

  • urgent refresh jobs for high-value product pages
  • normal scheduled recrawls
  • low-priority discovery jobs
  • heavy browser jobs that need Playwright

If you mix all of those together, simple HTML tasks get stuck behind expensive browser work, and the queue stops feeling predictable.
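
One way to keep lanes apart is to give each lane its own queue and its own small worker pool, so a slow browser job can only occupy browser workers. This is a sketch under assumptions: the lane names follow the list above, while the worker counts and the `run_lanes` helper are hypothetical.

```python
import asyncio
from typing import Dict, List, Tuple

# Workers per lane (assumed numbers; tune them to your own workload).
LANE_WORKERS: Dict[str, int] = {"urgent": 4, "recrawl": 3, "discovery": 1, "browser": 1}


async def lane_worker(lane: str, queue: asyncio.Queue, results: List[Tuple[str, str]]) -> None:
    while True:
        url = await queue.get()
        await asyncio.sleep(0.01)  # stand-in for the real fetch or browser render
        results.append((lane, url))
        queue.task_done()


async def run_lanes(jobs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Process (lane, url) pairs, each lane with its own queue and workers."""
    queues = {lane: asyncio.Queue() for lane in LANE_WORKERS}
    results: List[Tuple[str, str]] = []
    tasks = [
        asyncio.create_task(lane_worker(lane, queues[lane], results))
        for lane, count in LANE_WORKERS.items()
        for _ in range(count)
    ]
    for lane, url in jobs:
        await queues[lane].put(url)
    for queue in queues.values():
        await queue.join()  # wait for every lane to drain
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return results
```

Separate pools trade some utilization for predictability: idle discovery workers will not help with urgent jobs, but urgent jobs are never queued behind a browser render.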


Where ProxiesAPI fits

ProxiesAPI belongs in the fetch layer, not the queue layer.

That means:

  • workers decide what to fetch and when
  • the fetch layer decides how to route the request
  • parser logic stays unchanged

This separation is useful because queue behavior problems are rarely fixed by proxy routing alone. If the queue is unbounded or retries are undisciplined, better networking just lets the system misbehave more efficiently.
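
In code, that layering can be as small as passing the fetch function into the worker. The sketch below uses stub fetchers only; `direct_fetch`, `proxied_fetch`, `parse`, and `process` are hypothetical names, and the real call to a proxy service depends on that service's API, which is not shown here.

```python
import asyncio
from typing import Awaitable, Callable

# Any async function that turns a URL into HTML can serve as the fetch layer.
Fetcher = Callable[[str], Awaitable[str]]


async def direct_fetch(url: str) -> str:
    # Stub: a real version would make a plain HTTP request.
    return f"<html>direct:{url}</html>"


async def proxied_fetch(url: str) -> str:
    # Stub: a real version would route the request through a proxy service.
    return f"<html>proxied:{url}</html>"


def parse(html: str) -> str:
    # The parser only sees HTML; it does not know how the page was fetched.
    return html[len("<html>"):-len("</html>")]


async def process(url: str, fetch: Fetcher) -> str:
    # The worker decides what to fetch; the injected fetcher decides how.
    html = await fetch(url)
    return parse(html)
```

Swapping `direct_fetch` for `proxied_fetch` changes routing without touching `process` or `parse`, which is the point of keeping the queue and the fetch layer separate.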


Production checklist

Before calling your scraper "production-ready," check these:

| Capability | Why it matters |
| --- | --- |
| bounded global concurrency | prevents self-inflicted spikes |
| per-domain concurrency caps | protects targets and lowers ban risk |
| delayed retry queue | avoids retry storms |
| dead-letter handling | stops hopeless jobs from looping forever |
| queue metrics | lets you see overload before users do |
| priority lanes | protects important jobs from noisy background work |

This is the boring engineering that keeps scrapers alive.


The practical takeaway

If you are designing a web scraping queue, do not start with the question "How many workers can I run?"

Start with:

  1. what work deserves priority
  2. how much concurrency each target can tolerate
  3. which failures deserve retries
  4. what signal should make the system slow down

That is the difference between a scraper that runs fast in a demo and a scraper that survives in production for months.
