Scrape FanDuel Odds and Lines with Python

FanDuel is not a normal HTML scrape.

If you point BeautifulSoup at a sportsbook page and expect all visible odds to be sitting in the raw DOM, you will waste a lot of time. The practical workflow is:

  1. open the page in a browser
  2. capture the JSON/XHR traffic behind it
  3. normalize the pricing payload into rows you can store
  4. poll the same event endpoints to track line movement

This tutorial shows that pattern with Python.

We will collect:

  • matchup names
  • market names
  • selection names
  • American odds
  • start times
  • live snapshots for line-movement tracking

FanDuel Sportsbook request block encountered during live capture

Keep sportsbook collection resilient with ProxiesAPI

Sportsbook pages are dynamic, geo-sensitive, and often hostile to repetitive traffic. A ProxiesAPI-ready collection layer gives you a safer way to stabilize the fetch side while keeping your pricing parser reusable.


Why direct HTML scraping usually disappoints on FanDuel

Sportsbook pages are commonly:

  • hydrated client-side
  • backed by nested JSON payloads
  • personalized by region or jurisdiction
  • blocked when traffic does not look like a real browser session

So the reliable mental model is:

  • browser for discovery
  • JSON for extraction

That is also what makes the scraper maintainable. DOM classes change constantly; event payload structures usually change less often.


Install the dependencies

python3 -m venv .venv
source .venv/bin/activate
pip install playwright requests
playwright install chromium

We will use Playwright because it can observe network responses directly, so we never have to parse a huge rendered DOM. The requests package is only needed for the polling code in Step 4.


Step 1: Capture the sportsbook JSON behind the page

The key move is to listen for network responses while the page loads.

from __future__ import annotations

import json
from pathlib import Path
from urllib.parse import urlparse

from playwright.sync_api import sync_playwright


TARGET_URL = "https://sportsbook.fanduel.com/navigation/nba"


def should_capture(url: str) -> bool:
    return "fanduel.com" in url and (
        "/cache/" in url
        or "content-managed-page" in url
        or "/api/" in url
    )


def capture_payloads(output_dir: str = "fanduel_payloads") -> list[dict]:
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    captured = []

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page(viewport={"width": 1440, "height": 1200})

        def handle_response(response):
            url = response.url
            if not should_capture(url):
                return

            try:
                content_type = response.headers.get("content-type", "")
                if "json" not in content_type:
                    return

                payload = response.json()
                name = urlparse(url).path.strip("/").replace("/", "_")[:120] + ".json"
                path = Path(output_dir) / name
                path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
                captured.append({"url": url, "path": str(path)})
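            except Exception as exc:
                # Some bodies are unavailable (redirects, aborted requests).
                # Skip them, but log the URL so silent gaps are visible.
                print(f"skipped {url}: {exc}")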

        page.on("response", handle_response)
        page.goto(TARGET_URL, wait_until="domcontentloaded", timeout=60000)
        page.wait_for_timeout(8000)
        browser.close()

    return captured


if __name__ == "__main__":
    files = capture_payloads()
    print(f"captured {len(files)} JSON payloads")
    for item in files[:5]:
        print(item["url"])

This script does two things you want in production:

  • it keeps a local copy of the raw sportsbook payloads
  • it decouples later parsing work from live page access

That is a big deal when a site is geo-blocked or intermittently challenged.
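To illustrate the decoupling, here is a minimal sketch that inspects the saved files without touching the network. The function name summarize_saved_payloads is hypothetical; it only assumes the output directory written by the capture script above.

```python
from __future__ import annotations

import json
from pathlib import Path


def summarize_saved_payloads(output_dir: str = "fanduel_payloads") -> list[dict]:
    """List each saved JSON file with its top-level keys (offline replay)."""
    summaries = []
    for path in sorted(Path(output_dir).glob("*.json")):
        try:
            payload = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            # A partially written or corrupt capture should not stop the replay.
            continue
        # Only dict payloads have named top-level keys; lists get an empty summary.
        keys = sorted(payload.keys()) if isinstance(payload, dict) else []
        summaries.append({"file": path.name, "top_level_keys": keys})
    return summaries
```

Running this over a capture directory is a quick way to see which files contain keys such as events or markets before writing any parser code.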


Step 2: Convert raw price ratios into American odds

FanDuel-style payloads often carry raw price numerators and denominators instead of already formatted American lines.

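def ratio_to_american(price_up: int, price_down: int) -> int | None:
    # Missing or zero prices cannot be converted.
    if not price_up or not price_down:
        return None

    # Multiply before dividing and round the result. Truncating with int()
    # turns 23/20 into 114, because 1.15 * 100 == 114.99999999999999.
    # Even money (1/1) is reported as +100.
    if price_up >= price_down:
        return round(price_up * 100 / price_down)
    return -round(price_down * 100 / price_up)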

Examples:

price_up | price_down | American odds
20       | 23         | -115
23       | 20         | +115
10       | 11         | -110

This is the same idea you see when reverse engineering FanDuel payloads manually in DevTools.
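The same ratio also gives an implied probability, which is often easier to compare across snapshots than American odds. The sketch below is not part of any FanDuel payload format; both function names are hypothetical, and the only assumption is that price_up / price_down is a fractional price (profit / stake).

```python
from __future__ import annotations

from fractions import Fraction


def ratio_to_implied_probability(price_up: int, price_down: int) -> float | None:
    """Implied probability of a fractional price: down / (up + down)."""
    if not price_up or not price_down:
        return None
    return float(Fraction(price_down, price_up + price_down))


def american_to_implied_probability(american: int) -> float:
    """Implied probability of an American line (negative = favorite)."""
    if american < 0:
        return -american / (-american + 100)
    return 100 / (american + 100)
```

For example, a 20/23 price and a -115 line both map to 23/43, roughly 0.535. The two sides of a market usually sum to more than 1.0 because the bookmaker's margin is included in the prices.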


Step 3: Normalize markets into dashboard rows

The payload shape changes over time, but a common structure is:

  • events
  • markets
  • selections

Here is a defensive parser that handles the common case without assuming every node exists.

from __future__ import annotations

import json
from pathlib import Path


def normalize_market_payload(path: str) -> list[dict]:
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    rows = []

    events = payload.get("events", [])
    for event in events:
        event_name = event.get("eventname") or (
            f"{event.get('participantname_away')} @ {event.get('participantname_home')}"
        )
        start_time = event.get("tsstart")
        sport = event.get("sportname")

        for market in event.get("markets", []):
            market_name = market.get("name")

            for selection in market.get("selections", []):
                rows.append(
                    {
                        "sport": sport,
                        "event_name": event_name,
                        "start_time": start_time,
                        "market_name": market_name,
                        "selection_name": selection.get("name"),
                        "handicap": selection.get("currenthandicap"),
                        "american_odds": ratio_to_american(
                            selection.get("currentpriceup"),
                            selection.get("currentpricedown"),
                        ),
                    }
                )

    return rows

If your captured payload is event-specific rather than page-wide, adapt the root walk to:

  • eventmarketgroups
  • markets
  • selections

The normalization principle is the same.
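One way to keep a single parser for both shapes is a small generator that walks whichever root is present. This is a sketch; the key names are the ones mentioned above, and the function name iter_selections is hypothetical. Real payloads may nest differently.

```python
from __future__ import annotations

from typing import Iterator


def iter_selections(payload: dict) -> Iterator[tuple[dict, dict, dict]]:
    """Yield (container, market, selection) for either root shape."""
    # Page-wide shape: events -> markets -> selections
    for event in payload.get("events") or []:
        for market in event.get("markets") or []:
            for selection in market.get("selections") or []:
                yield event, market, selection

    # Event-specific shape: eventmarketgroups -> markets -> selections
    for group in payload.get("eventmarketgroups") or []:
        for market in group.get("markets") or []:
            for selection in market.get("selections") or []:
                yield group, market, selection
```

Using "or []" instead of a default argument also covers payloads where a key is present but null.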


Step 4: Poll for line movement

Once you have the event or market endpoint, line tracking becomes a polling problem.

from __future__ import annotations

import csv
import time
from datetime import datetime, timezone

import requests


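def snapshot_event_json(url: str) -> list[dict]:
    response = requests.get(url, timeout=30)
    # Fail loudly on 403/5xx instead of trying to parse a block page as JSON.
    response.raise_for_status()
    payload = response.json()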
    rows = []

    for group in payload.get("eventmarketgroups", []):
        for market in group.get("markets", []):
            for selection in market.get("selections", []):
                rows.append(
                    {
                        "captured_at": datetime.now(timezone.utc).isoformat(),
                        "event_name": market.get("eventname"),
                        "market_name": market.get("name"),
                        "selection_name": selection.get("name"),
                        "handicap": selection.get("currenthandicap"),
                        "american_odds": ratio_to_american(
                            selection.get("currentpriceup"),
                            selection.get("currentpricedown"),
                        ),
                    }
                )

    return rows


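def poll_line_movement(event_url: str, out_csv: str, iterations: int = 10, sleep_seconds: int = 30) -> None:
    fieldnames = ["captured_at", "event_name", "market_name", "selection_name", "handicap", "american_odds"]

    for i in range(iterations):
        rows = snapshot_event_json(event_url)
        # An empty snapshot (suspended market, no groups) must not crash the loop.
        if rows:
            with open(out_csv, "a", newline="", encoding="utf-8") as fh:
                writer = csv.DictWriter(fh, fieldnames=fieldnames)
                # Write the header only when the file is empty, so reruns that
                # append to the same CSV do not repeat it.
                if fh.tell() == 0:
                    writer.writeheader()
                writer.writerows(rows)
        # No need to sleep after the final snapshot.
        if i < iterations - 1:
            time.sleep(sleep_seconds)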

That CSV becomes the raw input for:

  • movement charts
  • stale-line alerts
  • arbitrage comparisons
  • model backtesting
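
As a starting point for movement charts or alerts, the snapshots can be reduced to the rows where something actually changed. This is a minimal sketch with a hypothetical function name, detect_line_moves; it assumes rows shaped like the ones written above (for example, loaded with csv.DictReader).

```python
from __future__ import annotations


def detect_line_moves(rows: list[dict]) -> list[dict]:
    """Return one record per change in odds or handicap for each selection."""
    last_seen: dict[tuple, dict] = {}
    moves = []
    # ISO-8601 UTC timestamps produced the same way sort correctly as strings.
    for row in sorted(rows, key=lambda r: r["captured_at"]):
        key = (row.get("event_name"), row.get("market_name"), row.get("selection_name"))
        current = {"american_odds": row.get("american_odds"), "handicap": row.get("handicap")}
        previous = last_seen.get(key)
        # The first sighting of a selection is a baseline, not a move.
        if previous is not None and previous != current:
            moves.append(
                {
                    "captured_at": row["captured_at"],
                    "event_name": key[0],
                    "market_name": key[1],
                    "selection_name": key[2],
                    "from": previous,
                    "to": current,
                }
            )
        last_seen[key] = current
    return moves
```

Values read back from CSV are strings, which is fine here because the function only tests for equality between consecutive snapshots.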

Practical anti-block notes

Sportsbooks are higher-friction targets than most tutorial sites, so be realistic:

Issue | What it means | Safer response
403 / CloudFront page | request never reached usable app content | stop retry storms; rotate IP/session
empty or incomplete HTML | data is hydrated by JS | capture network JSON with a browser
geo-specific gaps | certain markets differ by jurisdiction | record the region and keep runs separate
inconsistent payloads | same sport page mixes featured and event-specific markets | normalize everything into one row schema
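
To avoid retry storms, it helps to decide up front which responses are worth retrying and how long to wait. The sketch below is one possible policy, not a FanDuel-specific rule; the labels, function names, and default delays are all assumptions.

```python
from __future__ import annotations


def classify_status(status: int) -> str:
    """Map an HTTP status to a coarse, hypothetical handling label."""
    if status == 200:
        return "ok"
    if status == 403:
        return "blocked"  # retrying the same IP/session rarely helps
    if status == 429:
        return "rate_limited"
    if status >= 500:
        return "server_error"
    return "unexpected"


def should_retry(label: str) -> bool:
    """Retry only transient failures; a block needs a new IP/session instead."""
    return label in {"rate_limited", "server_error"}


def backoff_delays(max_attempts: int = 4, base_seconds: float = 2.0, cap_seconds: float = 60.0) -> list[float]:
    """Exponential backoff schedule, capped so waits never grow unbounded."""
    return [min(cap_seconds, base_seconds * 2**attempt) for attempt in range(max_attempts)]
```

With the defaults, the schedule is 2, 4, 8, and 16 seconds, after which the run should stop and record the failure rather than keep hammering the endpoint.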

If you are using ProxiesAPI in front of the browser or request layer, keep the integration explicit rather than magic:

  • store the original page URL
  • store the captured JSON URL
  • store the region / state context

That audit trail matters more than one extra field in the parser.
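
A minimal sketch of such an audit record, assuming one JSON line per captured payload. The class name CaptureRecord, its field names, and the example URLs in the usage note are hypothetical.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CaptureRecord:
    page_url: str      # page the browser opened
    json_url: str      # network response that was saved
    region: str        # state / jurisdiction context of the run
    captured_at: str   # UTC timestamp in ISO-8601 format


def make_capture_record(page_url: str, json_url: str, region: str) -> CaptureRecord:
    """Build a record stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    return CaptureRecord(page_url, json_url, region, now)


def to_jsonl_line(record: CaptureRecord) -> str:
    """Serialize a record as one JSON line for an append-only audit log."""
    return json.dumps(asdict(record), sort_keys=True)
```

For example, to_jsonl_line(make_capture_record("https://example.com/page", "https://example.com/api/data", "NJ")) yields a single line that can be appended to a log file next to the saved payloads.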


When to scrape the page and when to scrape the API

Use the rendered page when you need:

  • visual verification
  • screenshots
  • a way to discover the hidden endpoints

Use the JSON endpoint when you need:

  • repeatable data pulls
  • line history
  • lower compute cost
  • cleaner downstream schemas

The browser is your discovery tool. The JSON is your production feed.


Final thoughts

A good FanDuel scraper is not really a DOM scraper. It is a network-observation pipeline.

That mindset change makes the job easier:

  • discover payloads with Playwright
  • save raw JSON for replay
  • convert price ratios to American odds
  • poll event endpoints for movement over time

Once you do that, building the betting dashboard is the easy part.
