Python Requests with Proxy: Setup and Rotation Guide
If you search for “python requests proxy”, you’ll see a lot of examples like:
requests.get(url, proxies={"http": "...", "https": "..."})
That’s not wrong — but it’s also not enough.
In real scraping workloads, your proxy setup has to survive:
- slow sites and random timeouts
- intermittent 5xx errors
- connection resets
- target-specific rate limits
- “soft blocks” where HTML is returned but it’s an interstitial / bot check
This guide is the setup I wish every scraper started with:
- correct Requests proxy configuration
- retries + backoff
- session rotation patterns
- timeouts (connect + read)
- a “proxy API” alternative using ProxiesAPI
If you’d rather avoid managing raw proxy hosts/ports inside your code, ProxiesAPI gives you a single fetch endpoint you can drop into any Requests-based scraper.
1) Requests proxy basics (single proxy)
Requests supports proxying via the proxies parameter:
import requests
proxies = {
    "http": "http://127.0.0.1:8080",
    "https": "http://127.0.0.1:8080",
}
r = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=(10, 30))
r.raise_for_status()
print(r.json())
Notes:
- For HTTPS targets, Requests picks the proxy from the "https" key of the dict; the proxy URL itself usually stays http:// because the tunnel is set up with CONNECT and TLS still runs end to end.
- Always pass timeout=(connect, read) so your program never hangs on a dead proxy.
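Requests also reads proxy settings from the standard environment variables (HTTP_PROXY, HTTPS_PROXY, NO_PROXY), because trust_env defaults to True on every session. A minimal sketch:

import os
import requests

# Requests picks this up automatically for the whole process.
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:8080"
r = requests.get("https://httpbin.org/ip", timeout=(10, 30))

# To ignore environment proxies entirely:
s = requests.Session()
s.trust_env = False

This matters on CI machines and inside containers, where a proxy variable you forgot about can silently reroute your traffic.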
2) Authenticated proxies
If your proxy provider gives you user:pass@host:port, include it in the proxy URL.
proxies = {
    "http": "http://USER:PASS@proxy.example.com:8000",
    "https": "http://USER:PASS@proxy.example.com:8000",
}
r = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=(10, 30))
print(r.status_code)
If you get 407 Proxy Authentication Required, your credentials are wrong or your provider expects a different auth method.
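Another common cause of broken auth: special characters in the credentials. If USER or PASS contains @, :, or /, percent-encode each part before building the URL, for example:

from urllib.parse import quote

# "P@SS:WORD" is a made-up example; the '@' and ':' would otherwise
# confuse URL parsing.
user = quote("USER", safe="")
password = quote("P@SS:WORD", safe="")
proxy = f"http://{user}:{password}@proxy.example.com:8000"
proxies = {"http": proxy, "https": proxy}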
3) Use a Session (connection pooling + consistent headers)
For scraping, you almost always want a requests.Session():
- it reuses connections
- it centralizes headers
- it pairs naturally with retry logic
import requests
TIMEOUT = (10, 60)
session = requests.Session()
session.headers.update({
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
})
r = session.get("https://example.com", timeout=TIMEOUT)
r.raise_for_status()
print(len(r.text))
4) Retries the right way (urllib3 Retry)
Requests is built on urllib3, and you can configure retries via an HTTPAdapter.
This gets you:
- retry on transient status codes
- limited retries (no infinite loops)
- backoff between attempts
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
def build_session(*, proxies: dict | None = None) -> requests.Session:
    # "dict | None" needs Python 3.10+; use Optional[dict] on older versions.
    s = requests.Session()
    retry = Retry(
        total=6,
        connect=6,
        read=6,
        status=6,
        backoff_factor=0.7,  # sleeps grow exponentially between attempts
        status_forcelist=[429, 500, 502, 503, 504],
        allowed_methods=["GET", "HEAD"],  # only retry idempotent methods (urllib3 >= 1.26)
        raise_on_status=False,  # hand back the final response instead of raising
    )
    adapter = HTTPAdapter(max_retries=retry, pool_connections=50, pool_maxsize=50)
    s.mount("http://", adapter)
    s.mount("https://", adapter)
    if proxies:
        s.proxies.update(proxies)
    s.headers.update({
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/124.0.0.0 Safari/537.36"
        )
    })
    return s

session = build_session(proxies={
    "http": "http://USER:PASS@proxy.example.com:8000",
    "https": "http://USER:PASS@proxy.example.com:8000",
})
r = session.get("https://httpbin.org/status/503", timeout=(10, 30))
print("final status:", r.status_code)
Two important realities:
- Retries help with flaky networks and overloaded targets.
- Retries do not “solve” hard blocking. They just reduce your error rate.
5) Rotation patterns (what people mean by “rotating proxies”)
“Proxy rotation” can mean three different things:
A) Provider-side rotation (best, simplest)
You use one gateway hostname and the provider rotates IPs behind it.
Your code stays stable.
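In code this looks exactly like a single proxy; only the hostname changes. A sketch, with gw.provider.example standing in for your provider's rotating gateway:

# Hypothetical rotating gateway; the provider swaps the exit IP behind it.
gateway = "http://USER:PASS@gw.provider.example:8000"
proxies = {"http": gateway, "https": gateway}

s = build_session(proxies=proxies)  # build_session from section 4
for _ in range(3):
    # Each request may exit from a different IP.
    print(s.get("https://httpbin.org/ip", timeout=(10, 30)).json())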
B) Session/sticky rotation (rotate every N requests)
You keep a session identifier for a short time to reduce variance (useful for sites that dislike IP hopping).
If your provider supports session=... parameters in the username, you can do:
import random
def proxy_with_session(base_user: str, password: str, host: str, port: int, session_id: str) -> dict:
    user = f"{base_user}-session-{session_id}"
    proxy = f"http://{user}:{password}@{host}:{port}"
    return {"http": proxy, "https": proxy}
session_id = str(random.randint(100000, 999999))
proxies = proxy_with_session("USER", "PASS", "proxy.example.com", 8000, session_id)
s = build_session(proxies=proxies)
print(s.get("https://httpbin.org/ip", timeout=(10, 30)).json())
(Your provider’s exact username format will differ.)
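To actually rotate every N requests, refresh the session id periodically. A sketch reusing the helpers above (urls stands in for your real URL list):

urls = ["https://httpbin.org/ip"] * 100  # placeholder workload
ROTATE_EVERY = 25

s = None
for i, url in enumerate(urls):
    if i % ROTATE_EVERY == 0:
        # New sticky session id -> new exit IP for the next batch.
        sid = str(random.randint(100000, 999999))
        s = build_session(proxies=proxy_with_session(
            "USER", "PASS", "proxy.example.com", 8000, sid))
    s.get(url, timeout=(10, 30))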
C) Client-side pool rotation (list of proxies)
You maintain a list of proxy endpoints and pick a new one per request or per batch.
This works, but adds operational overhead.
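If you go this route, the simplest workable version is a round-robin cycle. A sketch with placeholder endpoints:

import itertools
import requests

PROXY_POOL = [
    "http://USER:PASS@proxy1.example.com:8000",
    "http://USER:PASS@proxy2.example.com:8000",
    "http://USER:PASS@proxy3.example.com:8000",
]
_pool = itertools.cycle(PROXY_POOL)

def next_proxies() -> dict:
    # Hand out the next endpoint in round-robin order.
    p = next(_pool)
    return {"http": p, "https": p}

r = requests.get("https://httpbin.org/ip", proxies=next_proxies(), timeout=(10, 30))
print(r.json())

A real pool also needs health checks and a way to bench endpoints that keep failing, which is exactly the operational overhead mentioned above.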
6) Detect “soft blocks” (200 OK but wrong HTML)
A lot of anti-bot pages return 200 with HTML that looks like:
- “Access Denied”
- captcha prompt
- JS challenge
So add a fast content check before you parse:
def looks_blocked(html: str) -> bool:
    t = (html or "").lower()
    return any(x in t for x in [
        "access denied",
        "verify you are human",
        "captcha",
        "unusual traffic",
    ])
html = session.get("https://example.com", timeout=(10, 60)).text
if looks_blocked(html):
    raise RuntimeError("blocked")
This one habit saves hours.
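You can also combine the check with rotation: if a response looks blocked, switch exits and try again. A sketch reusing build_session (section 4) and the round-robin next_proxies helper (section 5C):

def fetch_unblocked(url: str, attempts: int = 3) -> str:
    for _ in range(attempts):
        # Fresh session on a different proxy for each attempt.
        s = build_session(proxies=next_proxies())
        r = s.get(url, timeout=(10, 60))
        if r.ok and not looks_blocked(r.text):
            return r.text
    raise RuntimeError(f"still blocked after {attempts} attempts: {url}")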
7) The simpler alternative: ProxiesAPI fetch pattern
Sometimes you don’t want to manage raw proxies in Requests at all.
In that case, ProxiesAPI gives you a single fetch endpoint:
curl "http://api.proxiesapi.com/?key=API_KEY&url=https://example.com"
Python version:
import requests
from urllib.parse import quote_plus
TIMEOUT = (10, 60)
def fetch_via_proxiesapi(target_url: str, api_key: str) -> str:
    url = f"http://api.proxiesapi.com/?key={quote_plus(api_key)}&url={quote_plus(target_url)}"
    r = requests.get(url, timeout=TIMEOUT)
    r.raise_for_status()
    return r.text
html = fetch_via_proxiesapi("https://httpbin.org/ip", "API_KEY")
print(html[:200])
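The endpoint is a normal HTTP URL, so the retrying session from section 4 works on it unchanged. A sketch:

s = build_session()  # no proxies entry: ProxiesAPI is reached directly

def fetch_via_proxiesapi_with_retries(target_url: str, api_key: str) -> str:
    url = f"http://api.proxiesapi.com/?key={quote_plus(api_key)}&url={quote_plus(target_url)}"
    r = s.get(url, timeout=TIMEOUT)
    r.raise_for_status()
    return r.text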
Tradeoffs:
- Pros: easy integration, fewer moving parts in your app
- Cons: less low-level control than a full proxy pool
8) A quick “production” checklist
Use this when your script moves from “toy” to “job”:
- timeouts on every request (connect + read)
- retries with backoff for 429/5xx
- content checks for soft blocks
- rotate proxies or sessions thoughtfully (not randomly every request)
- respect robots/ToS and rate limits
- store raw HTML for debugging when things break
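For that last item, a minimal snapshot helper (a sketch; the naming scheme is just one reasonable choice):

import hashlib
import pathlib
import time

def save_snapshot(url: str, html: str, root: str = "snapshots") -> pathlib.Path:
    # One file per fetch, keyed by URL hash + timestamp, for postmortems.
    d = pathlib.Path(root)
    d.mkdir(exist_ok=True)
    name = f"{hashlib.sha1(url.encode()).hexdigest()[:12]}-{int(time.time())}.html"
    path = d / name
    path.write_text(html, encoding="utf-8")
    return path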
Final thoughts
Most “python requests proxy” snippets on the internet get you to your first request.
What you want is a setup that gets you to your 10,000th request without falling apart.
Start with Sessions + timeouts + retries.
Then decide whether you want:
- raw proxies (more control)
- or a proxy API fetch approach (simpler integration) like ProxiesAPI.