How to Scrape AutoTrader Used Car Listings with Python (Make/Model/Price/Mileage)
AutoTrader results pages are packed with useful data:
- listing title (year/make/model/trim)
- price
- mileage
- location
- dealer vs private seller signals
In this tutorial we’ll build a scraper that turns an AutoTrader search into structured JSON using requests + BeautifulSoup.
We’ll also do this the “production way”: timeouts, retries, and selectors that degrade gracefully.
Classifieds sites can be sensitive to request volume and repeated searches. ProxiesAPI lets you proxy-fetch result pages via a single URL so you can focus on parsing + data quality instead of proxy plumbing.
What we’re scraping
AutoTrader search results are typically under a URL like:
https://www.autotrader.com/cars-for-sale/all-cars?zip=10001&startYear=2018&endYear=2026&makeCodeList=TOYOTA&modelCodeList=CAMRY
(Parameters vary by region/search.)
We’ll scrape result cards, not individual listing pages. That keeps the request count lower and is enough for most datasets.
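Rather than hand-editing query strings, you can assemble the search URL programmatically. A minimal sketch (parameter names are taken from the example URL above; they may differ for other regions or search types):

```python
from urllib.parse import urlencode

def build_search_url(zip_code: str, make: str, model: str,
                     start_year: int, end_year: int) -> str:
    """Assemble an AutoTrader search URL from the query parameters
    seen in the example URL (names may vary by region/search)."""
    params = {
        "zip": zip_code,
        "startYear": start_year,
        "endYear": end_year,
        "makeCodeList": make,
        "modelCodeList": model,
    }
    return "https://www.autotrader.com/cars-for-sale/all-cars?" + urlencode(params)

print(build_search_url("10001", "TOYOTA", "CAMRY", 2018, 2026))
```

`urlencode` also takes care of escaping, which matters once you start passing free-text parameters.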
Setup
python -m venv .venv
source .venv/bin/activate
pip install requests beautifulsoup4 lxml
Step 1: Fetch HTML through ProxiesAPI
Basic curl sanity-check:
API_KEY="YOUR_PROXIESAPI_KEY"
TARGET="https://www.autotrader.com/cars-for-sale/all-cars?zip=10001&startYear=2018&endYear=2026&makeCodeList=TOYOTA&modelCodeList=CAMRY"
# URL-encode the target so its & and = aren't parsed as ProxiesAPI parameters
curl -sG "http://api.proxiesapi.com/" \
  --data-urlencode "key=$API_KEY" \
  --data-urlencode "url=$TARGET" | head -n 20
Python fetch wrapper:
import time
import urllib.parse

import requests

API_KEY = "YOUR_PROXIESAPI_KEY"
TIMEOUT = (10, 60)  # (connect, read) seconds

session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
})

def proxiesapi_url(target_url: str) -> str:
    return "http://api.proxiesapi.com/?" + urllib.parse.urlencode({
        "key": API_KEY,
        "url": target_url,
    })

def fetch_html(target_url: str, retries: int = 3, backoff: float = 2.0) -> str:
    url = proxiesapi_url(target_url)
    last_err = None
    for attempt in range(1, retries + 1):
        try:
            r = session.get(url, timeout=TIMEOUT)
            r.raise_for_status()
            if len(r.text) < 15_000:
                raise RuntimeError(f"Suspiciously small response: {len(r.text)} bytes")
            return r.text
        except Exception as e:
            last_err = e
            sleep_s = backoff ** attempt
            print(f"attempt {attempt}/{retries} failed: {e} -> sleeping {sleep_s:.1f}s")
            time.sleep(sleep_s)
    raise RuntimeError(f"Failed after {retries} retries: {last_err}")
Step 2: Identify stable selectors
AutoTrader is more JS-heavy than some sites, but result pages often still contain useful server-rendered HTML.
A common pattern is that each listing card is wrapped with a data-testid attribute.
In your first run, do:
from bs4 import BeautifulSoup
html = fetch_html("https://www.autotrader.com/cars-for-sale/all-cars?zip=10001&startYear=2018&endYear=2026&makeCodeList=TOYOTA&modelCodeList=CAMRY")
soup = BeautifulSoup(html, "lxml")
print("title:", soup.title.get_text(strip=True) if soup.title else None)
# Probe a few likely patterns
print("cards-testid:", len(soup.select('[data-testid*="listing"]')))
print("cards-article:", len(soup.select("article")))
If the HTML is mostly scripts and you don’t see listing text at all, you’ll need a browser automation approach. But before you go that route, verify your URL is a real public results page and you’re not getting a “blocked” response.
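A cheap heuristic can flag blocked responses before you debug selectors. The marker strings below are assumptions, not confirmed AutoTrader markers; inspect a real blocked page and adjust:

```python
def looks_blocked(html: str) -> bool:
    """Rough heuristic for a block/interstitial page.

    The marker strings are guesses -- adjust them after inspecting
    an actual blocked response from your target.
    """
    lowered = html.lower()
    markers = ("access denied", "captcha", "unusual traffic", "request blocked")
    # Very small bodies are usually error pages, not result pages
    return any(m in lowered for m in markers) or len(html) < 500
```

Call it right after `fetch_html` and bail out early instead of parsing garbage.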
Step 3: Parse listing cards
We’ll extract:
- title (often includes year/make/model)
- price
- mileage
- location
- a listing URL
We’ll keep values as text and normalize later (because mileage/price formatting varies).
import re
from bs4 import BeautifulSoup

BASE = "https://www.autotrader.com"

def clean_text(x: str | None) -> str | None:
    if not x:
        return None
    x = re.sub(r"\s+", " ", x).strip()
    return x or None

def parse_listings(html: str) -> list[dict]:
    soup = BeautifulSoup(html, "lxml")
    # Prefer explicit testid cards if present
    cards = soup.select('[data-testid*="listing-card"], [data-testid*="inventory-listing"], article')
    out = []
    for c in cards:
        # Title
        title_el = c.select_one("h2") or c.select_one("h3")
        title = clean_text(title_el.get_text(" ", strip=True) if title_el else None)

        # Price
        price_el = (
            c.select_one('[data-testid*="price"]')
            or c.find(string=re.compile(r"\$\s?\d"))
        )
        price = None
        if price_el:
            price = clean_text(price_el.get_text(" ", strip=True) if hasattr(price_el, "get_text") else str(price_el))

        # Mileage (often like "23,451 miles")
        mileage_el = c.find(string=re.compile(r"miles", re.I))
        mileage = clean_text(str(mileage_el)) if mileage_el else None

        # Location -- match an uppercase two-letter state code.
        # No re.I here: case-insensitive matching would hit any
        # two-letter word ("in", "of", ...) in the card text.
        location_el = (
            c.select_one('[data-testid*="location"]')
            or c.find(string=re.compile(r"\b[A-Z]{2}\b"))
        )
        location = None
        if location_el:
            location = clean_text(location_el.get_text(" ", strip=True) if hasattr(location_el, "get_text") else str(location_el))

        # Link
        a = c.select_one('a[href*="/cars-for-sale/vehicledetails"]') or c.select_one('a[href^="/"]')
        href = a.get("href") if a else None
        if href and href.startswith("/"):
            href = BASE + href.split("?")[0]

        if not title and not price and not href:
            continue

        out.append({
            "title": title,
            "price_text": price,
            "mileage_text": mileage,
            "location_text": location,
            "url": href,
        })

    # De-dupe by URL/title
    seen = set()
    uniq = []
    for item in out:
        key = item.get("url") or item.get("title")
        if not key or key in seen:
            continue
        seen.add(key)
        uniq.append(item)
    return uniq
Terminal-style run
if __name__ == "__main__":
    target = "https://www.autotrader.com/cars-for-sale/all-cars?zip=10001&startYear=2018&endYear=2026&makeCodeList=TOYOTA&modelCodeList=CAMRY"
    html = fetch_html(target)
    items = parse_listings(html)
    print("listings:", len(items))
    for it in items[:5]:
        print(it)
Example output:
listings: 23
{'title': '2021 Toyota Camry SE', 'price_text': '$23,995', 'mileage_text': '34,210 miles', 'location_text': 'Brooklyn, NY', 'url': 'https://www.autotrader.com/cars-for-sale/vehicledetails.xhtml?...'}
...
Export to JSON
import json

with open("autotrader_listings.json", "w", encoding="utf-8") as f:
    json.dump(items, f, ensure_ascii=False, indent=2)
print("wrote autotrader_listings.json", len(items))
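If you prefer spreadsheets, the same items list can also be written as CSV. The field names below match the dict keys produced by parse_listings:

```python
import csv

FIELDS = ["title", "price_text", "mileage_text", "location_text", "url"]

def export_csv(items: list[dict], path: str = "autotrader_listings.csv") -> None:
    """Write the scraped listing dicts to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(items)
```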
Practical notes (so you don’t get blocked)
- Don’t run the same query 100 times in a minute.
- Cache HTML for debugging.
- Add delays between page fetches.
- Use ProxiesAPI for a more stable network layer.
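The caching and delay advice can be folded into one helper. A sketch (the cache layout and delay value are arbitrary choices, not ProxiesAPI requirements):

```python
import hashlib
import pathlib
import time

CACHE_DIR = pathlib.Path(".html_cache")
CACHE_DIR.mkdir(exist_ok=True)

def cached_fetch(target_url: str, fetch, delay: float = 3.0) -> str:
    """Serve HTML from a disk cache, sleeping between live requests.

    `fetch` is any callable like fetch_html from Step 1, so this
    helper stays decoupled from the network layer.
    """
    key = hashlib.sha256(target_url.encode()).hexdigest()[:16]
    path = CACHE_DIR / f"{key}.html"
    if path.exists():
        return path.read_text(encoding="utf-8")
    time.sleep(delay)  # be polite between live fetches
    html = fetch(target_url)
    path.write_text(html, encoding="utf-8")
    return html
```

While you iterate on selectors, every rerun hits the cache instead of the site.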
Where ProxiesAPI fits
AutoTrader scraping tends to break when your network layer is unstable (intermittent blocks, inconsistent content).
ProxiesAPI keeps the integration clean: fetch your target URL via a single proxy-backed endpoint, then focus on parsing and data validation.
QA checklist
- listings > 0
- Titles look like real vehicles
- URLs open correctly
- Your exporter writes valid JSON
- You respect delays/timeouts