Scrape eBay Listings and Prices (Green List site)

eBay search pages are a great scraping exercise: the structure is consistent, pagination is explicit, and the data is useful for price tracking and market research.

The catch is real: direct repeated requests are often blocked with 403 responses. That is why this tutorial is built around a fetch layer you can route through ProxiesAPI from day one.

We will build a scraper that:

  • fetches search results
  • extracts title, price, shipping, seller, condition, and URL
  • paginates across multiple result pages
  • exports clean CSV and JSON files

eBay search results (we will scrape listing cards)

Make eBay scraping less brittle with ProxiesAPI

eBay often blocks direct, repeated requests from a single IP. Keeping a clean fetch layer (and routing it through ProxiesAPI when needed) helps you scale searches and pagination without constantly reworking your code.


URL patterns and pagination

A common eBay search URL is:

https://www.ebay.com/sch/i.html?_nkw=iphone

Pagination typically uses _pgn:

https://www.ebay.com/sch/i.html?_nkw=iphone&_pgn=2
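
Multi-word keywords need escaping before they go into the query string. As a minimal sketch, the standard library's urlencode takes care of that. The helper name search_url is hypothetical and used only for illustration; the _nkw and _pgn parameters are the ones shown above.

```python
from urllib.parse import urlencode

BASE = "https://www.ebay.com/sch/i.html"


def search_url(keyword: str, page: int = 1) -> str:
    """Build a search URL; _pgn is only added from page 2 onwards."""
    params = {"_nkw": keyword}
    if page > 1:
        params["_pgn"] = str(page)
    # urlencode escapes spaces as "+" and percent-encodes reserved characters.
    return f"{BASE}?{urlencode(params)}"


print(search_url("iphone"))
print(search_url("iphone 15 pro", page=2))
```

Building URLs this way, rather than concatenating strings by hand, keeps keywords with spaces or ampersands from silently breaking the query.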


Setup

python3 -m venv .venv
source .venv/bin/activate
pip install requests beautifulsoup4 lxml

Step 1: Fetch layer (ProxiesAPI-friendly)

This section is intentionally structured so you can run with normal direct fetches while debugging, then switch to ProxiesAPI when you scale.

This uses the common wrapper format:

http://api.proxiesapi.com/?auth_key=YOUR_KEY&url=https://target.com/...

Set PROXIESAPI_KEY in your environment to enable it.

import os
import random
import time
from urllib.parse import quote, urlencode

import requests
from bs4 import BeautifulSoup

UA_POOL = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0 Safari/537.36",
]

def proxiesapi_url(target_url: str) -> str:
    key = os.environ.get("PROXIESAPI_KEY")
    if not key:
        return target_url
    return f"http://api.proxiesapi.com/?auth_key={quote(key)}&url={quote(target_url, safe='')}"

session = requests.Session()


def fetch(
    url: str,
    *,
    use_proxiesapi: bool = True,
    timeout: tuple[int, int] = (10, 30),
    max_retries: int = 4,
) -> str:
    last_err: Exception | None = None
    for attempt in range(1, max_retries + 1):
        try:
            final = proxiesapi_url(url) if use_proxiesapi else url
            r = session.get(
                final,
                timeout=timeout,
                headers={
                    "User-Agent": random.choice(UA_POOL),
                    "Accept-Language": "en-US,en;q=0.9",
                },
            )
            r.raise_for_status()
            html = r.text
            # r.text is a decoded string, so len() counts characters, not bytes.
            if not html or len(html) < 2000:
                raise RuntimeError(f"Suspiciously small HTML ({len(html)} characters)")
            return html
        except Exception as e:
            last_err = e
            if attempt == max_retries:
                break
            time.sleep(0.8 * (2 ** (attempt - 1)) + random.random() * 0.25)
    raise last_err or RuntimeError("fetch failed")
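
To see how long fetch can wait before giving up, it helps to write out the backoff schedule on its own. The sketch below reproduces only the deterministic part of the sleep expression used above (0.8 * 2 ** (attempt - 1)); the function name backoff_delays is hypothetical, and the random jitter of up to 0.25s per retry is left out.

```python
def backoff_delays(max_retries: int = 4, base: float = 0.8) -> list[float]:
    """Base delay (before jitter) slept after each failed attempt.

    No sleep happens after the final attempt, so there are
    max_retries - 1 delays in total.
    """
    return [base * (2 ** (attempt - 1)) for attempt in range(1, max_retries)]


print(backoff_delays())  # [0.8, 1.6, 3.2]
```

With the defaults, a request that fails every time sleeps for roughly 5.6 seconds in total, plus jitter, on top of whatever time the four attempts themselves take.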

Step 2: Parse listings

eBay search pages commonly structure results with li.s-item. Useful inner selectors:

  • link: a.s-item__link
  • title: .s-item__title
  • price: .s-item__price
  • shipping: .s-item__shipping (often present, sometimes not)
  • condition: .SECONDARY_INFO (often present, sometimes not)
  • seller: .s-item__seller-info-text, falling back to .s-item__seller-info

def clean_text(x: str | None) -> str | None:
    if x is None:
        return None
    t = " ".join(x.split()).strip()
    return t or None

def parse_search_results(html: str) -> list[dict]:
    soup = BeautifulSoup(html, "lxml")
    items = soup.select("li.s-item")

    out: list[dict] = []
    for it in items:
        a = it.select_one("a.s-item__link[href]")
        url = a.get("href") if a else None

        title_el = it.select_one(".s-item__title")
        title = clean_text(title_el.get_text(" ", strip=True) if title_el else None)
        if not title or title.lower() in {"shop on ebay", "results matching fewer words"}:
            continue

        price_el = it.select_one(".s-item__price")
        ship_el = it.select_one(".s-item__shipping")
        seller_el = it.select_one(".s-item__seller-info-text") or it.select_one(".s-item__seller-info")
        condition_el = it.select_one(".SECONDARY_INFO")

        out.append({
            "title": title,
            "price": clean_text(price_el.get_text(" ", strip=True) if price_el else None),
            "shipping": clean_text(ship_el.get_text(" ", strip=True) if ship_el else None),
            "seller": clean_text(seller_el.get_text(" ", strip=True) if seller_el else None),
            "condition": clean_text(condition_el.get_text(" ", strip=True) if condition_el else None),
            "url": url,
        })

    return out
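
The parser keeps price as the raw display string. For price tracking you will usually want numbers, so here is a minimal sketch of one way to convert it. The function parse_price is hypothetical, and it assumes US-style formatting such as "$1,299.00" and ranges such as "$10.00 to $25.00"; other locales and currencies would need their own handling.

```python
import re
from typing import Optional

# One number with optional thousands separators and decimals, e.g. 1,299.00
_NUM = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text: Optional[str]) -> Optional[tuple[float, float]]:
    """Return (low, high) from a price string, or None if it has no number.

    A single price yields the same value twice; a range yields both ends.
    """
    if not text:
        return None
    nums = [float(m.replace(",", "")) for m in _NUM.findall(text)]
    if not nums:
        return None
    return (min(nums), max(nums))


print(parse_price("$1,299.00"))
print(parse_price("$10.00 to $25.00"))
print(parse_price("Free shipping"))
```

Returning a pair keeps ranged listings from being collapsed into a single misleading number; store the raw string alongside it so you can re-parse later if the format turns out to differ.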

Step 3: Pagination and export (CSV + JSON)

import csv
import json

def build_search_url(*, keyword: str, page: int) -> str:
    base = "https://www.ebay.com/sch/i.html"
    params = {"_nkw": keyword, "_pgn": str(page)}
    return f"{base}?{urlencode(params)}"

def crawl_search(keyword: str, *, pages: int = 3) -> list[dict]:
    seen: set[str] = set()
    all_rows: list[dict] = []
    for page in range(1, pages + 1):
        url = build_search_url(keyword=keyword, page=page)
        html = fetch(url)
        batch = parse_search_results(html)
        for row in batch:
            u = row.get("url") or ""
            if not u or u in seen:
                continue
            seen.add(u)
            all_rows.append(row)
        if not batch:
            break
    return all_rows

def write_csv(rows: list[dict], path: str) -> None:
    fieldnames = ["title", "price", "shipping", "seller", "condition", "url"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=fieldnames)
        w.writeheader()
        for r in rows:
            w.writerow({k: r.get(k) for k in fieldnames})

def write_json(rows: list[dict], path: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f, ensure_ascii=False, indent=2)

if __name__ == "__main__":
    rows = crawl_search("iphone", pages=2)
    print("rows:", len(rows))
    write_csv(rows, "ebay_search_results.csv")
    write_json(rows, "ebay_search_results.json")
    print("wrote ebay_search_results.csv")
    print("wrote ebay_search_results.json")
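
crawl_search dedupes on the raw URL string. If listing links carry per-request tracking parameters (an assumption worth checking against your own output), two links to the same listing will compare as different. A minimal sketch of a normalizer you could apply before the seen check is below; canonical_url is a hypothetical helper and the example URL is made up.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment so tracking parameters
    do not defeat URL-based dedupe."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Made-up example URL, for illustration only.
print(canonical_url("https://www.ebay.com/itm/123456789012?hash=abc&_trkparms=x#top"))
```

Only do this if the path alone identifies a listing. If the query string ever selects a variant you care about, keep the parameters that matter and strip the rest instead.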

Where ProxiesAPI fits

eBay is a site where the difference between a toy script and a useful pipeline is usually the network layer. Keep parsing/export pure, and make ProxiesAPI a switch in fetch; that lets you scale keywords and pages without repeatedly re-architecting.


Practical hardening checklist (so it keeps working)

If you want this to run daily (or for dozens of keywords), treat these as non-optional:

  • Don’t hammer pages: add small jitter between requests (even 0.5–1.5s helps).
  • Log failures: persist the URL + HTTP error so you can retry later instead of losing data.
  • Dedupe early: URLs can repeat across pages and “sponsored” modules.
  • Fail fast on bad HTML: if you keep parsing tiny/captcha pages, you’ll quietly write garbage data.
  • Keep selectors minimal: eBay changes UI often; fewer selectors survive longer.
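
The first two items on the checklist can be sketched with the standard library alone. Both helpers below, polite_sleep and log_failure, are hypothetical names, and the JSON Lines log format is just one reasonable choice rather than anything the scraper above requires.

```python
import json
import random
import time
from typing import Callable


def polite_sleep(
    low: float = 0.5,
    high: float = 1.5,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep for a random delay in [low, high] seconds and return it.

    The sleep function is injectable so the helper can be tested
    without actually waiting.
    """
    delay = random.uniform(low, high)
    sleep(delay)
    return delay


def log_failure(path: str, url: str, error: Exception) -> None:
    """Append one JSON line per failed URL so it can be retried later."""
    record = {
        "url": url,
        "error": f"{type(error).__name__}: {error}",
        "ts": time.time(),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

One way to use them: call polite_sleep() between pages inside crawl_search, and wrap the fetch(url) call in a try/except that calls log_failure("failures.jsonl", url, e) and moves on to the next page instead of aborting the whole run.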

When you already have clean functions (fetch → parse → export), switching use_proxiesapi on or off is a one-parameter change, and the rest of your pipeline stays stable.
