Scrape Marktplaats Seller Listings and Prices with Python

If you already know which seller you care about, scraping that seller's page is usually more useful than scraping broad search results.

A seller page gives you:

  • a tighter inventory list
  • a cleaner pricing view
  • listing URLs tied to one merchant or store
  • useful seller metadata in the page source

In this guide we'll scrape a real Marktplaats seller page and collect:

  • seller name
  • listing title
  • listing price
  • listing URL
  • city or location when present

Marktplaats seller page (we'll scrape seller inventory + prices)

Keep marketplace crawls reliable with ProxiesAPI

Marktplaats seller pages are workable over plain HTTP, but larger inventory crawls tend to break down under throttling, failed requests, and IP reputation problems. ProxiesAPI helps keep the fetch step predictable while your parser stays simple.


What we're scraping

Marktplaats seller pages usually look like this:

https://www.marktplaats.nl/u/fietshokje/11746360/

For this walkthrough we'll use a seller inventory page that exposes multiple bike listings in the HTML:

https://www.marktplaats.nl/u/fietshokje-groningen/11746360/q/fiets/

In the live response, you can verify useful fields are embedded directly in the HTML:

  • sellerName
  • priceInfo
  • vipUrl
  • sellerId

That means a plain HTTP request is enough for a first pass of inventory data; there is no need to render the page in a browser.
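
Before writing a parser, it can help to confirm that those keys really are present in the response you received. The snippet below is a minimal sketch: `count_markers` is a hypothetical helper name, and the sample string is made-up stand-in data, not a real Marktplaats response.

```python
# Keys the article expects to find in the seller page source.
MARKERS = ("sellerName", "priceInfo", "vipUrl", "sellerId")


def count_markers(html: str, markers=MARKERS) -> dict[str, int]:
    """Return {key: occurrences} for each quoted JSON key in the page."""
    return {name: html.count(f'"{name}"') for name in markers}


# Made-up fragment standing in for a downloaded page.
sample = '{"sellerName":"X","priceInfo":{"priceCents":100},"vipUrl":"/v/a"}'
print(count_markers(sample))
```

If a key you rely on comes back with a count of zero on the live page, that is a sign the response differs from what you expected (a block page, a consent page, or a changed layout) and the parser should not run on it.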


Setup

python3 -m venv .venv
source .venv/bin/activate
pip install requests

We'll also use:

  • csv from the standard library for export
  • re to extract embedded values cleanly

Step 1: Fetch the seller page

import requests

SELLER_URL = "https://www.marktplaats.nl/u/fietshokje-groningen/11746360/q/fiets/"
TIMEOUT = (10, 30)

session = requests.Session()
session.headers.update(
    {
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/126.0.0.0 Safari/537.36"
        ),
        "Accept-Language": "nl-NL,nl;q=0.9,en;q=0.8",
    }
)


def fetch_html(url: str) -> str:
    response = session.get(url, timeout=TIMEOUT)
    response.raise_for_status()
    return response.text


html = fetch_html(SELLER_URL)
print("downloaded", len(html), "chars")

Terminal sanity check:

curl -L -s "https://www.marktplaats.nl/u/fietshokje-groningen/11746360/q/fiets/" | head -n 6

You should see a title like:

<title>≥ FIETSHOKJE - Advertenties op Marktplaats</title>

Step 2: Choose the right extraction strategy

Marktplaats is a JavaScript-driven web app, but the seller page already includes serialized listing data in the server response.

That gives you two options:

  1. scrape visible HTML cards
  2. parse the embedded listing data

For seller inventory, the second approach is usually cleaner. We can search the response for repeated listing objects containing:

  • itemId
  • title
  • priceInfo
  • sellerInformation
  • vipUrl

This is still web scraping. We are just reading structured data that the page already ships to the browser.


Step 3: Extract seller listings

import re
from urllib.parse import urljoin

BASE = "https://www.marktplaats.nl"


def cents_to_eur(price_cents: int | None) -> str | None:
    if price_cents is None:
        return None
    euros = price_cents / 100
    return f"EUR {euros:,.2f}"


# Each gap (.{0,N}?) is lazy and bounded, so one match cannot run far past
# a single listing object. The pattern assumes the fields appear in this order.
LISTING_PATTERN = re.compile(
    r'"itemId":"(?P<item_id>[^"]+)"'
    r'.{0,600}?'
    r'"title":"(?P<title>[^"]+)"'
    r'.{0,1200}?'
    r'"priceInfo":\{"priceCents":(?P<price_cents>\d+),"priceType":"(?P<price_type>[^"]+)"'
    r'.{0,800}?'
    r'"cityName":"(?P<city>[^"]+)"'
    r'.{0,1200}?'
    r'"sellerName":"(?P<seller_name>[^"]+)"'
    r'.{0,1600}?'
    r'"vipUrl":"(?P<vip_url>[^"]+)"',
    re.DOTALL,
)


def parse_listings(html: str) -> list[dict]:
    rows = []
    seen = set()

    for match in LISTING_PATTERN.finditer(html):
        data = match.groupdict()
        url = urljoin(BASE, data["vip_url"])
        if url in seen:
            continue

        rows.append(
            {
                "item_id": data["item_id"],
                "seller_name": data["seller_name"],
                "title": data["title"],
                "price": cents_to_eur(int(data["price_cents"])),
                "price_type": data["price_type"],
                "city": data["city"],
                "url": url,
            }
        )
        seen.add(url)

    return rows


rows = parse_listings(html)
print("parsed", len(rows), "rows")
print(rows[:3])
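
One caveat with the pattern above: `[^"]+` captures the raw text between two quotes, so a title containing an escaped quote would be cut short, and escapes such as `\u0026` would stay undecoded. The sketch below shows one way to handle both, assuming the embedded strings follow normal JSON escaping. `JSON_STRING_BODY`, `TITLE_PATTERN`, and `decode_json_string` are hypothetical names, and the fragment is made-up demo data.

```python
import json
import re

# Body of a JSON string: any character except quote or backslash,
# or a backslash followed by one character (covers \" and \uXXXX).
JSON_STRING_BODY = r'(?:[^"\\]|\\.)*'

TITLE_PATTERN = re.compile(r'"title":"(' + JSON_STRING_BODY + r')"')


def decode_json_string(raw: str) -> str:
    """Decode the raw text between two JSON quotes into a Python string."""
    return json.loads(f'"{raw}"')


# Made-up fragment with a unicode escape and an escaped quote.
fragment = r'{"itemId":"m1","title":"Gazelle \u0026 Batavus 28\" fiets"}'
raw_title = TITLE_PATTERN.search(fragment).group(1)
print(decode_json_string(raw_title))
```

The same string body could replace `[^"]+` in the other captured fields if you see truncated or escaped values in your rows.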

Why regex is acceptable here

Normally I prefer parsing JSON rather than regexing HTML.

But on seller pages like this one, the listing payload is embedded as repeated serialized fragments inside a much larger document. For a tutorial, a bounded regex is a practical way to:

  • prove the fields exist
  • keep dependencies light
  • extract exactly the fields you care about

If you want a more production-grade parser, your next step is to locate the full serialized object and load it with json.loads.
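
As a rough sketch of that next step, the code below tries every `<script>` body as JSON and walks whatever parses, collecting dicts that carry both `itemId` and `vipUrl`. It assumes the listing payload sits in a script tag whose body is pure JSON, which you should verify against the live page. The helper names and the demo HTML (including the `FIXED` price type) are made up for illustration.

```python
import json
import re

# Assumption: the payload lives in a <script> tag whose body is pure JSON.
SCRIPT_BODY = re.compile(r"<script\b[^>]*>(.*?)</script>", re.DOTALL | re.IGNORECASE)


def iter_json_payloads(html: str):
    """Yield every <script> body that parses as JSON; skip the rest."""
    for body in SCRIPT_BODY.findall(html):
        try:
            yield json.loads(body)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            continue


def find_listings(node, found=None) -> list[dict]:
    """Recursively collect dicts that carry both itemId and vipUrl."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if "itemId" in node and "vipUrl" in node:
            found.append(node)
        for value in node.values():
            find_listings(value, found)
    elif isinstance(node, list):
        for value in node:
            find_listings(value, found)
    return found


# Made-up page: one plain JavaScript tag, one JSON tag with a single listing.
demo_html = (
    "<script>var x = 1;</script>"
    '<script type="application/json">'
    '{"props":{"listings":[{"itemId":"m1","title":"Fiets",'
    '"priceInfo":{"priceCents":12500,"priceType":"FIXED"},"vipUrl":"/v/fietsen/m1"}]}}'
    "</script>"
)
listings = [
    item
    for payload in iter_json_payloads(demo_html)
    for item in find_listings(payload)
]
print(listings[0]["priceInfo"]["priceCents"])
```

Walking the parsed tree avoids hard-coding the path to the listings array, which tends to move when the front end is redeployed.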


Step 4: Export seller inventory to CSV

import csv


def export_csv(rows: list[dict], path: str) -> None:
    fieldnames = [
        "item_id",
        "seller_name",
        "title",
        "price",
        "price_type",
        "city",
        "url",
    ]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


export_csv(rows, "marktplaats_seller_inventory.csv")
print("saved", len(rows), "rows to CSV")

You can now sort or filter by:

  • fixed price vs bid listings
  • city
  • seller branch name
  • title keywords

That is often enough for inventory monitoring, resale analysis, or competitor tracking.
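
A small sketch of that filtering step, reading the exported columns back with `csv.DictReader`. Because the `price` column is stored as a formatted string, the hypothetical `price_to_cents` helper converts it back to an integer for sorting. The demo rows, including the `FIXED` price type, are made up.

```python
import csv
import io


def price_to_cents(price: str) -> int:
    """Convert a string like 'EUR 1,250.00' back to integer cents.

    Assumes the two-decimal format written by cents_to_eur.
    """
    number = price.removeprefix("EUR ").replace(",", "")
    euros, _, cents = number.partition(".")
    return int(euros) * 100 + int(cents or "0")


def filter_rows(rows, city=None, keyword=None):
    """Keep rows matching the city exactly and the keyword in the title."""
    kept = []
    for row in rows:
        if city is not None and row["city"] != city:
            continue
        if keyword is not None and keyword.lower() not in row["title"].lower():
            continue
        kept.append(row)
    return kept


# Made-up CSV content in the same column layout as the export above.
demo_csv = io.StringIO(
    "item_id,seller_name,title,price,price_type,city,url\n"
    'm1,Demo,Gazelle stadsfiets,"EUR 1,250.00",FIXED,Groningen,https://example.com/m1\n'
    "m2,Demo,Batavus fiets,EUR 95.00,FIXED,Groningen,https://example.com/m2\n"
    "m3,Demo,Kinderfiets,EUR 40.00,FIXED,Assen,https://example.com/m3\n"
)
demo_rows = list(csv.DictReader(demo_csv))
cheapest_first = sorted(
    filter_rows(demo_rows, city="Groningen"),
    key=lambda row: price_to_cents(row["price"]),
)
print([row["item_id"] for row in cheapest_first])
```

If you expect to sort by price often, exporting the raw cents as an extra column would make this conversion unnecessary.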


Full script

import csv
import re
import requests
from urllib.parse import urljoin

SELLER_URL = "https://www.marktplaats.nl/u/fietshokje-groningen/11746360/q/fiets/"
BASE = "https://www.marktplaats.nl"
TIMEOUT = (10, 30)

session = requests.Session()
session.headers.update(
    {
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/126.0.0.0 Safari/537.36"
        ),
        "Accept-Language": "nl-NL,nl;q=0.9,en;q=0.8",
    }
)

LISTING_PATTERN = re.compile(
    r'"itemId":"(?P<item_id>[^"]+)"'
    r'.{0,600}?'
    r'"title":"(?P<title>[^"]+)"'
    r'.{0,1200}?'
    r'"priceInfo":\{"priceCents":(?P<price_cents>\d+),"priceType":"(?P<price_type>[^"]+)"'
    r'.{0,800}?'
    r'"cityName":"(?P<city>[^"]+)"'
    r'.{0,1200}?'
    r'"sellerName":"(?P<seller_name>[^"]+)"'
    r'.{0,1600}?'
    r'"vipUrl":"(?P<vip_url>[^"]+)"',
    re.DOTALL,
)


def fetch_html(url):
    response = session.get(url, timeout=TIMEOUT)
    response.raise_for_status()
    return response.text


def cents_to_eur(price_cents):
    return f"EUR {price_cents / 100:,.2f}"


def parse_listings(html):
    rows = []
    seen = set()

    for match in LISTING_PATTERN.finditer(html):
        data = match.groupdict()
        url = urljoin(BASE, data["vip_url"])
        if url in seen:
            continue

        rows.append(
            {
                "item_id": data["item_id"],
                "seller_name": data["seller_name"],
                "title": data["title"],
                "price": cents_to_eur(int(data["price_cents"])),
                "price_type": data["price_type"],
                "city": data["city"],
                "url": url,
            }
        )
        seen.add(url)

    return rows


def export_csv(rows, path):
    fieldnames = [
        "item_id",
        "seller_name",
        "title",
        "price",
        "price_type",
        "city",
        "url",
    ]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def main():
    html = fetch_html(SELLER_URL)
    rows = parse_listings(html)
    export_csv(rows, "marktplaats_seller_inventory.csv")
    print(f"saved {len(rows)} rows")


if __name__ == "__main__":
    main()

Practical improvements

Once the base seller scraper works, the best upgrades are:

  • crawl multiple seller URLs from a seed file
  • keep item_id as your stable primary key
  • alert when a listing disappears or the price changes
  • store the raw HTML for debugging when parsing fails
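
The disappearance and price-change alerts can be sketched as a comparison of two scrape results keyed by `item_id`. `diff_snapshots` is a hypothetical helper, and the two snapshots below are made-up rows in the same shape that `parse_listings` returns.

```python
def diff_snapshots(previous: list[dict], current: list[dict]) -> dict[str, list[str]]:
    """Compare two scrapes by item_id and report what changed."""
    old = {row["item_id"]: row for row in previous}
    new = {row["item_id"]: row for row in current}
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "price_changed": sorted(
            item_id
            for item_id in old.keys() & new.keys()
            if old[item_id]["price"] != new[item_id]["price"]
        ),
    }


# Made-up snapshots from two consecutive runs.
yesterday = [
    {"item_id": "m1", "price": "EUR 125.00"},
    {"item_id": "m2", "price": "EUR 95.00"},
]
today = [
    {"item_id": "m2", "price": "EUR 85.00"},
    {"item_id": "m3", "price": "EUR 40.00"},
]
print(diff_snapshots(yesterday, today))
```

In a recurring job, `previous` would come from the last saved CSV and `current` from the fresh scrape.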

You can also combine this with a search-page scraper:

  • search results discover new sellers
  • seller pages monitor the inventory you care about repeatedly

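That division keeps the monitoring job much cheaper, because the recurring job re-fetches only the seller pages you track rather than broad search results.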


Where ProxiesAPI fits

Scraping one seller page once is easy.

Scraping hundreds of seller pages every day is not. That is when you start dealing with:

  • retries
  • inconsistent responses
  • network-level throttling
  • regional variance

ProxiesAPI is useful at that stage because it improves the fetch layer without forcing you to rewrite the parser. Your extraction logic stays the same, but the crawl is less fragile when you move from one URL to many.

That is the honest value proposition: not magic extraction, just a steadier network path for recurring scraping jobs.
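
Whatever sits underneath the fetch step, a small retry wrapper keeps transient failures out of the parser. The sketch below is one possible shape, not a ProxiesAPI feature: `fetch_with_retries` is a hypothetical helper that takes any fetch callable, and the flaky function stands in for a real network call.

```python
import time


def fetch_with_retries(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url); on failure wait base_delay * 2**n, then try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception as exc:  # narrow this to your HTTP client's errors
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Stand-in fetcher that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch(url):
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "<html>ok</html>"


delays = []  # record the waits instead of actually sleeping
page = fetch_with_retries(flaky_fetch, "https://example.com/", sleep=delays.append)
print(page, delays)
```

With the article's code this would be called as `fetch_with_retries(fetch_html, SELLER_URL)`. Because the wrapper only sees a callable, swapping the underlying fetch for a proxied one does not change the retry logic or the parser.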
