Scrape FanDuel Odds and Lines with Python

Jun 19, 2026 · tutorial · #python, #fanduel, #sports-betting, #playwright, #json, #proxies

FanDuel is not a normal HTML scrape.

If you point BeautifulSoup at a sportsbook page and expect all visible odds to be sitting in the raw DOM, you will waste a lot of time. The practical workflow is:

open the page in a browser
capture the JSON/XHR traffic behind it
normalize the pricing payload into rows you can store
poll the same event endpoints to track line movement

This tutorial shows that pattern with Python.

We will collect:

matchup names
market names
selection names
American odds
start times
live snapshots for line-movement tracking

Figure: FanDuel Sportsbook request block encountered during live capture.

Keep sportsbook collection resilient with ProxiesAPI

Sportsbook pages are dynamic, geo-sensitive, and often hostile to repetitive traffic. A ProxiesAPI-ready collection layer gives you a safer way to stabilize the fetch side while keeping your pricing parser reusable.


Why direct HTML scraping usually disappoints on FanDuel

Sportsbook pages are commonly:

hydrated client-side
backed by nested JSON payloads
personalized by region or jurisdiction
blocked when traffic does not look like a real browser session

So the reliable mental model is:

browser for discovery
JSON for extraction

That is also what makes the scraper maintainable. DOM classes change constantly; event payload structures usually change less often.

Install the dependencies

python3 -m venv .venv
source .venv/bin/activate
pip install playwright requests
playwright install chromium

We will use Playwright because it is good at observing network responses without forcing us to parse a huge rendered DOM. The requests library is used later, in Step 4, for polling.

Step 1: Capture the sportsbook JSON behind the page

The key move is to listen for network responses while the page loads.

from __future__ import annotations

import json
from pathlib import Path
from urllib.parse import urlparse

from playwright.sync_api import sync_playwright


TARGET_URL = "https://sportsbook.fanduel.com/navigation/nba"


def should_capture(url: str) -> bool:
    return "fanduel.com" in url and (
        "/cache/" in url
        or "content-managed-page" in url
        or "/api/" in url
    )


def capture_payloads(output_dir: str = "fanduel_payloads") -> list[dict]:
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    captured = []

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page(viewport={"width": 1440, "height": 1200})

        def handle_response(response):
            url = response.url
            if not should_capture(url):
                return

            try:
                content_type = response.headers.get("content-type", "")
                if "json" not in content_type:
                    return

                payload = response.json()
                name = urlparse(url).path.strip("/").replace("/", "_")[:120] + ".json"
                path = Path(output_dir) / name
                path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
                captured.append({"url": url, "path": str(path)})
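            except Exception:
                # Some bodies are unavailable (redirects, aborted requests) or
                # are not valid JSON despite the header; skip them quietly.
                pass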

        page.on("response", handle_response)
        page.goto(TARGET_URL, wait_until="domcontentloaded", timeout=60000)
        page.wait_for_timeout(8000)
        browser.close()

    return captured


if __name__ == "__main__":
    files = capture_payloads()
    print(f"captured {len(files)} JSON payloads")
    for item in files[:5]:
        print(item["url"])

This script does two things you want in production:

it keeps a local copy of the raw sportsbook payloads
it decouples later parsing work from live page access

That matters when a site is geo-blocked or intermittently challenged: you can rerun and debug the parser against the saved files without touching the live page again.
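Before writing a parser, it helps to see what was actually saved. The sketch below lists each captured file with its top-level keys so you can spot which payloads carry events or markets. The helper name summarize_payloads is hypothetical, and it assumes the files were written by the capture script into fanduel_payloads.

```python
from __future__ import annotations

import json
from pathlib import Path


def summarize_payloads(output_dir: str = "fanduel_payloads") -> list[dict]:
    """List each saved JSON file with its top-level keys (or list length)."""
    summaries = []
    for path in sorted(Path(output_dir).glob("*.json")):
        try:
            payload = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            continue  # skip files that are not valid JSON
        if isinstance(payload, dict):
            shape = sorted(payload.keys())
        elif isinstance(payload, list):
            shape = [f"list[{len(payload)}]"]
        else:
            shape = [type(payload).__name__]
        summaries.append({"file": path.name, "shape": shape})
    return summaries


if __name__ == "__main__":
    for item in summarize_payloads():
        print(item["file"], item["shape"])
```

Files whose keys include something like events or eventmarketgroups are the likely candidates for Step 3; the rest can usually be ignored.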

Step 2: Convert raw price ratios into American odds

FanDuel-style payloads often carry raw price numerators and denominators instead of already formatted American lines.

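from __future__ import annotations


def ratio_to_american(price_up: int, price_down: int) -> int | None:
    # Missing or zero prices cannot be converted.
    if not price_up or not price_down:
        return None

    # round() rather than int(): 23 / 20 * 100 evaluates to 114.99999999999999
    # in floating point, and int() would truncate it to 114.
    if price_up >= price_down:
        return round(100 * price_up / price_down)
    return -round(100 * price_down / price_up)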

Examples:

price_up	price_down	American odds
20	23	-115
23	20	+115
10	11	-110

This is the same idea you see when reverse engineering FanDuel payloads manually in DevTools.
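As a cross-check on the table above, here is a minimal standalone sketch of the conversion that rounds to the nearest integer and renders the sign explicitly. The names american_from_ratio and format_american are illustrative, not part of any FanDuel payload or library. Rounding matters because a truncating conversion can land one unit off when the float result sits just below a whole number.

```python
from __future__ import annotations


def american_from_ratio(price_up: int, price_down: int) -> int | None:
    """Convert a fractional price (up/down) to American odds, rounded to nearest."""
    if not price_up or not price_down:
        return None
    if price_up >= price_down:
        # Underdog or even money: profit per 100 staked.
        return round(100 * price_up / price_down)
    # Favorite: stake needed to win 100, shown as a negative number.
    return -round(100 * price_down / price_up)


def format_american(odds: int | None) -> str:
    """Render odds with an explicit sign, e.g. +115 or -110."""
    return "n/a" if odds is None else f"{odds:+d}"


if __name__ == "__main__":
    for up, down in [(20, 23), (23, 20), (10, 11)]:
        print(up, down, format_american(american_from_ratio(up, down)))
```

Storing the integer and formatting the sign only at display time keeps the stored values easy to compare and sort.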

Step 3: Normalize markets into dashboard rows

The payload shape changes over time, but a common structure is:

events
markets
selections

Here is a defensive parser that handles the common case without assuming every node exists.

from __future__ import annotations

import json
from pathlib import Path


def normalize_market_payload(path: str) -> list[dict]:
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    rows = []

    events = payload.get("events", [])
    for event in events:
        event_name = event.get("eventname") or (
            f"{event.get('participantname_away')} @ {event.get('participantname_home')}"
        )
        start_time = event.get("tsstart")
        sport = event.get("sportname")

        for market in event.get("markets", []):
            market_name = market.get("name")

            for selection in market.get("selections", []):
                rows.append(
                    {
                        "sport": sport,
                        "event_name": event_name,
                        "start_time": start_time,
                        "market_name": market_name,
                        "selection_name": selection.get("name"),
                        "handicap": selection.get("currenthandicap"),
                        "american_odds": ratio_to_american(
                            selection.get("currentpriceup"),
                            selection.get("currentpricedown"),
                        ),
                    }
                )

    return rows

If your captured payload is event-specific rather than page-wide, adapt the root walk to:

eventmarketgroups
markets
selections

The normalization principle is the same.
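One way to avoid maintaining two parsers is a small generator that walks whichever root is present. This is a sketch under the assumption that the two shapes look like the ones described above (events → markets and eventmarketgroups → markets); the function name iter_markets is hypothetical, and real payloads may nest differently.

```python
from __future__ import annotations

from typing import Iterator


def iter_markets(payload: dict) -> Iterator[tuple[str | None, dict]]:
    """Yield (event_name, market) pairs from either payload shape."""
    # Page-wide shape: events -> markets
    for event in payload.get("events") or []:
        for market in event.get("markets") or []:
            yield event.get("eventname"), market
    # Event-specific shape: eventmarketgroups -> markets
    for group in payload.get("eventmarketgroups") or []:
        for market in group.get("markets") or []:
            yield market.get("eventname"), market
```

The `or []` guards cover keys that are present but set to null, which a plain `.get(key, [])` would not handle.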

Step 4: Poll for line movement

Once you have the event or market endpoint, line tracking becomes a polling problem.

from __future__ import annotations

import csv
import time
from datetime import datetime, timezone

import requests


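def snapshot_event_json(url: str) -> list[dict]:
    response = requests.get(url, timeout=30)
    # Fail loudly on 403/5xx instead of trying to parse a block page as JSON.
    response.raise_for_status()
    payload = response.json()
    rows = []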

    for group in payload.get("eventmarketgroups", []):
        for market in group.get("markets", []):
            for selection in market.get("selections", []):
                rows.append(
                    {
                        "captured_at": datetime.now(timezone.utc).isoformat(),
                        "event_name": market.get("eventname"),
                        "market_name": market.get("name"),
                        "selection_name": selection.get("name"),
                        "handicap": selection.get("currenthandicap"),
                        "american_odds": ratio_to_american(
                            selection.get("currentpriceup"),
                            selection.get("currentpricedown"),
                        ),
                    }
                )

    return rows


FIELDNAMES = ["captured_at", "event_name", "market_name", "selection_name", "handicap", "american_odds"]


def poll_line_movement(event_url: str, out_csv: str, iterations: int = 10, sleep_seconds: int = 30) -> None:
    for i in range(iterations):
        rows = snapshot_event_json(event_url)
        if rows:  # an empty snapshot has nothing to write
            with open(out_csv, "a", newline="", encoding="utf-8") as fh:
                writer = csv.DictWriter(fh, fieldnames=FIELDNAMES)
                if fh.tell() == 0:  # new or empty file: write the header once
                    writer.writeheader()
                writer.writerows(rows)
        if i < iterations - 1:  # no need to wait after the final snapshot
            time.sleep(sleep_seconds)

That CSV becomes the raw input for:

movement charts
stale-line alerts
arbitrage comparisons
model backtesting
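As a starting point for the first two uses, the sketch below reads the snapshot CSV and reports every change in handicap or odds per selection. It assumes the column names written by the polling code above and that rows are in chronological order, which holds for an append-only file. The function name find_line_moves is hypothetical.

```python
from __future__ import annotations

import csv


def find_line_moves(csv_path: str) -> list[dict]:
    """Return one record per change in odds or handicap for each selection."""
    last_seen: dict[tuple, tuple] = {}
    moves = []
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            key = (row["event_name"], row["market_name"], row["selection_name"])
            # Values are compared as the strings stored in the CSV.
            current = (row["handicap"], row["american_odds"])
            previous = last_seen.get(key)
            if previous is not None and previous != current:
                moves.append(
                    {
                        "captured_at": row["captured_at"],
                        "selection": key,
                        "from": previous,
                        "to": current,
                    }
                )
            last_seen[key] = current
    return moves
```

A selection that never appears in the returned list across many snapshots has not moved at all in that window, which is worth checking when you look for stale lines.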

Practical anti-block notes

Sportsbooks are higher-friction targets than most tutorial sites, so be realistic:

Issue	What it means	Safer response
403 / CloudFront page	request never reached usable app content	stop retry storms; rotate IP/session
empty or incomplete HTML	data is hydrated by JS	capture network JSON with a browser
geo-specific gaps	certain markets differ by jurisdiction	record the region and keep runs separate
inconsistent payloads	same sport page mixes featured and event-specific markets	normalize everything into one row schema
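The first row of the table can be sketched as a bounded retry loop: back off on transient errors, but stop immediately on a 403 so the caller can change IP or session rather than hammering a block page. Everything here is illustrative. The fetch callable, the Blocked exception, and the delay values are assumptions, not part of any library.

```python
from __future__ import annotations

import time
from typing import Callable


class Blocked(Exception):
    """Raised when the server answers with a block-style status such as 403."""


def fetch_with_backoff(
    fetch: Callable[[], tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch() until it returns HTTP 200, waiting longer after each failure."""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:
            return body
        if status == 403:
            # A block page will not clear by retrying; hand control back to the caller.
            raise Blocked("403 received; rotate IP/session before trying again")
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s with the defaults
    raise RuntimeError(f"gave up after {max_attempts} attempts")
```

Passing fetch as a callable that returns (status, body) keeps the retry policy independent of whether the request goes through requests, Playwright, or a proxy layer.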