Scrape Google Maps Business Listings with Python: Search → Place Details → Reviews (ProxiesAPI)

Google Maps is the internet’s biggest business directory.

If you do outbound (agencies, local SEO, B2B services), a “maps leads” dataset is gold:

  • business name
  • category
  • rating + review count
  • phone
  • website
  • address
  • opening hours
  • a few recent reviews

In this tutorial, we’ll build a practical pipeline for extracting that data:

  1. Search (find places for a query like “dentist in Austin TX”)
  2. Place details (open each listing and capture structured fields)
  3. Reviews (optional: fetch N most recent reviews)

We’ll do it in a way that’s honest about the tradeoffs:

  • Some data is loaded dynamically.
  • Google changes markup.
  • You may hit consent pages and bot challenges.

So we’ll use a 2-layer approach:

  • Playwright to drive the UI and capture the canonical place URL / ID (plus screenshots)
  • Python + Requests for the repeatable extraction layer (with retries + ProxiesAPI)

Google Maps results screen (we’ll screenshot the search results + a place panel)

Make lead scraping more reliable with ProxiesAPI

Google surfaces bot checks quickly. ProxiesAPI helps you rotate IPs and stabilize requests so your lead pipeline doesn’t die mid-run.


What we’re scraping (and why it’s tricky)

Google Maps is mostly a client app. A lot of the data you see in the left panel (or place sheet) is populated via internal APIs.

That means you have three options:

  1. UI automation (Playwright/Selenium): slower but closest to what a user sees
  2. Network reverse engineering: fast but brittle and can break without warning
  3. Official Places API: stable but paid + terms

This guide focuses on UI automation + lightweight parsing. It’s the most teachable and tends to survive small changes.


Setup

python -m venv .venv
source .venv/bin/activate
pip install playwright requests python-dotenv
playwright install

Create .env:

PROXIESAPI_KEY="YOUR_KEY"
PROXIESAPI_PROXY_URL="http://user:pass@gateway:port"  # if ProxiesAPI provides a proxy URL

Step 1: Resilient HTTP fetch (with optional ProxiesAPI)

Even if you drive the UI with Playwright, you’ll still want a fetch helper for:

  • grabbing the canonical place page HTML
  • pulling a lightweight JSON endpoint (if available)
  • downloading your own exported artifacts

Here’s a reusable fetch().

import os
import random
import time
from typing import Optional

import requests
from dotenv import load_dotenv

load_dotenv()

PROXY_URL = os.getenv("PROXIESAPI_PROXY_URL")
TIMEOUT = (10, 30)


def make_session() -> requests.Session:
    s = requests.Session()
    s.headers.update({
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/123.0.0.0 Safari/537.36"
        ),
        "Accept-Language": "en-US,en;q=0.9",
    })
    return s


def fetch(url: str, session: requests.Session, max_attempts: int = 4) -> str:
    last_exc: Optional[Exception] = None

    for attempt in range(1, max_attempts + 1):
        time.sleep(random.uniform(0.8, 2.0))

        try:
            proxies = None
            if PROXY_URL:
                proxies = {"http": PROXY_URL, "https": PROXY_URL}

            r = session.get(url, timeout=TIMEOUT, proxies=proxies, allow_redirects=True)

            if r.status_code in (403, 429, 500, 502, 503, 504):
                time.sleep(min(10, 1.5 ** attempt) + random.uniform(0, 0.7))
                continue

            r.raise_for_status()
            return r.text

        except Exception as e:
            last_exc = e
            time.sleep(min(10, 1.5 ** attempt) + random.uniform(0, 0.7))

    raise RuntimeError(f"Failed to fetch {url} after {max_attempts} attempts") from last_exc
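The retry delays in fetch() follow a capped exponential with jitter. To see the schedule it produces (same formula as above, purely illustrative):

```python
import random

# Preview the backoff schedule used in fetch():
# capped exponential (1.5 ** attempt, max 10s) plus up to 0.7s of jitter.
for attempt in range(1, 5):
    base = min(10, 1.5 ** attempt)
    delay = base + random.uniform(0, 0.7)
    print(f"attempt {attempt}: base {base:.2f}s, with jitter {delay:.2f}s")
```

The cap matters: without `min(10, ...)` a long retry loop can stall a worker for minutes.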

Step 2: Collect place URLs from a search

We’ll open Google Maps, run a search, then scroll results and collect place links.

Search URL format (often works):

  • https://www.google.com/maps/search/<QUERY>

Example:

  • https://www.google.com/maps/search/dentists+in+austin+tx
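quote_plus (used in the collector below) is what produces those + separators; it also percent-encodes anything that isn’t URL-safe:

```python
from urllib.parse import quote_plus

# Spaces become "+", everything non-URL-safe gets percent-encoded.
print(quote_plus("dentists in austin tx"))  # dentists+in+austin+tx
print(quote_plus("café & bar in nyc"))      # caf%C3%A9+%26+bar+in+nyc
```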

Playwright collector

import json
import re
from urllib.parse import quote_plus

from playwright.sync_api import sync_playwright


def maps_search_url(query: str) -> str:
    return f"https://www.google.com/maps/search/{quote_plus(query)}"


def collect_place_urls(query: str, max_places: int = 30) -> list[str]:
    urls: list[str] = []
    seen = set()

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page(viewport={"width": 1280, "height": 720})

        page.goto(maps_search_url(query), wait_until="domcontentloaded")

        # Give the app a moment to hydrate
        page.wait_for_timeout(2500)

        # Scroll the results panel (role/aria labels change; selectors are best verified with a screenshot)
        for _ in range(25):
            # Collect links that look like /maps/place/...
            anchors = page.locator('a[href^="https://www.google.com/maps/place"]')
            for i in range(anchors.count()):
                href = anchors.nth(i).get_attribute("href")
                if not href:
                    continue
                if href in seen:
                    continue
                seen.add(href)
                urls.append(href)
                if len(urls) >= max_places:
                    break
            if len(urls) >= max_places:
                break

            # mouse.wheel scrolls whatever is under the cursor, so hover
            # the results feed first (role="feed" is a heuristic; verify it)
            feed = page.locator('div[role="feed"]').first
            if feed.count() > 0:
                feed.hover()
            page.mouse.wheel(0, 1800)
            page.wait_for_timeout(1200)

        browser.close()

    return urls


if __name__ == "__main__":
    q = "dentists in austin tx"
    urls = collect_place_urls(q, max_places=20)
    print("collected", len(urls))
    print(urls[:3])
    with open("place_urls.json", "w", encoding="utf-8") as f:
        json.dump(urls, f, indent=2)
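One gotcha: the hrefs you collect usually embed viewport coordinates, so the same place can show up under several distinct URLs across runs. A hypothetical canonical_place_key() helper (assuming the common /maps/place/&lt;Name&gt;/@... URL shape) gives you a stable dedupe key:

```python
import re


def canonical_place_key(url: str) -> str:
    """Extract a stable key from a /maps/place/ URL for dedupe.

    Hypothetical helper: place URLs typically look like
    https://www.google.com/maps/place/<Name>/@<lat>,<lng>,<zoom>z/data=...
    The <Name> segment is usually stable while the viewport part changes.
    """
    m = re.search(r"/maps/place/([^/]+)", url)
    return m.group(1) if m else url


u1 = "https://www.google.com/maps/place/Acme+Dental/@30.26,-97.74,15z/data=!x1"
u2 = "https://www.google.com/maps/place/Acme+Dental/@30.27,-97.75,14z/data=!x2"
print(canonical_place_key(u1) == canonical_place_key(u2))  # True
```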

Why screenshots matter here

Google Maps’ DOM is complex and changes. The best workflow is:

  1. open the page manually
  2. take a screenshot of the results panel and a place panel
  3. verify the anchor patterns you’re extracting

We’ll do that in the screenshot step later.
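A small helper (the names here are ours, not any API) keeps verification screenshots organized per query; call page.screenshot() with the returned path from inside your Playwright session:

```python
import re
from pathlib import Path


def screenshot_path(query: str, label: str, out_dir: str = "screenshots") -> str:
    """Build a predictable path like screenshots/dentists-in-austin-tx__results.png."""
    slug = re.sub(r"[^a-z0-9]+", "-", query.lower()).strip("-")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    return str(Path(out_dir) / f"{slug}__{label}.png")


# Inside your Playwright session (sketch):
#     page.screenshot(path=screenshot_path(q, "results"))
#     page.screenshot(path=screenshot_path(q, "place-panel"))
```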


Step 3: Extract place details

There are two strategies:

  • parse fields directly from the DOM in Playwright (often reliable)
  • fetch the canonical place URL HTML and parse (sometimes returns simplified markup)

We’ll use the UI DOM because it’s closest to what users see.

Playwright place detail parser

from dataclasses import dataclass, asdict


@dataclass
class Place:
    name: str | None
    rating: str | None
    reviews: str | None
    category: str | None
    address: str | None
    phone: str | None
    website: str | None
    url: str


def parse_place_panel(page) -> Place:
    # NOTE: These selectors are heuristics. Always verify with a screenshot on your target query.
    name = page.locator("h1").first.text_content()

    rating = None
    rating_el = page.locator('[role="img"][aria-label*="stars"]').first
    if rating_el.count() > 0:
        rating = rating_el.get_attribute("aria-label")

    # Common: category and address appear as buttons in the panel
    category = None
    cat_btn = page.locator('button[jsaction*="pane.rating.category"]').first
    if cat_btn.count() > 0:
        category = cat_btn.text_content()

    address = None
    addr = page.locator('button[data-item-id="address"]').first
    if addr.count() > 0:
        address = addr.text_content()

    phone = None
    ph = page.locator('button[data-item-id="phone"]').first
    if ph.count() > 0:
        phone = ph.text_content()

    website = None
    web = page.locator('a[data-item-id="authority"]').first
    if web.count() > 0:
        website = web.get_attribute("href")

    # review count sometimes appears near rating
    reviews = None
    rev_btn = page.locator('button[jsaction*="pane.rating.moreReviews"]').first
    if rev_btn.count() > 0:
        reviews = rev_btn.text_content()

    return Place(
        name=(name or "").strip() if name else None,
        rating=(rating or "").strip() if rating else None,
        reviews=(reviews or "").strip() if reviews else None,
        category=(category or "").strip() if category else None,
        address=(address or "").strip() if address else None,
        phone=(phone or "").strip() if phone else None,
        website=(website or "").strip() if website else None,
        url=page.url,
    )
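The Place fields are kept as raw strings (“4.8 stars”, “1,234 reviews”). If you need numbers downstream, a pair of regex helpers works, assuming the English-locale formats shown here:

```python
import re
from typing import Optional


def parse_rating(aria_label: Optional[str]) -> Optional[float]:
    """Pull a number out of labels like '4.8 stars' (wording varies by locale)."""
    if not aria_label:
        return None
    m = re.search(r"(\d+(?:\.\d+)?)", aria_label)
    return float(m.group(1)) if m else None


def parse_review_count(text: Optional[str]) -> Optional[int]:
    """Pull an int out of strings like '1,234 reviews'."""
    if not text:
        return None
    m = re.search(r"([\d,]+)", text)
    return int(m.group(1).replace(",", "")) if m else None


print(parse_rating("4.8 stars"))            # 4.8
print(parse_review_count("1,234 reviews"))  # 1234
```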

Putting it together: search → open each place → parse

import csv
import time
from dataclasses import asdict

from playwright.sync_api import sync_playwright

# collect_place_urls() and parse_place_panel() come from the earlier steps;
# keep them in the same module or import them.


def scrape_places_from_query(query: str, max_places: int = 25) -> list[dict]:
    place_urls = collect_place_urls(query, max_places=max_places)

    out: list[dict] = []

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page(viewport={"width": 1280, "height": 720})

        for url in place_urls:
            page.goto(url, wait_until="domcontentloaded")
            page.wait_for_timeout(2000)

            try:
                place = parse_place_panel(page)
                out.append(asdict(place))
                print("parsed", place.name)
            except Exception as e:
                print("failed", url, e)

            time.sleep(1.0)

        browser.close()

    return out


def write_csv(rows: list[dict], path: str = "maps_leads.csv") -> None:
    if not rows:
        return
    with open(path, "w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        w.writeheader()
        w.writerows(rows)


if __name__ == "__main__":
    rows = scrape_places_from_query("dentists in austin tx", max_places=20)
    write_csv(rows)
    print("wrote", len(rows))
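Before writing the CSV you may also want to drop duplicates, since the same practice can surface under multiple URLs. A minimal pass keyed on (name, address):

```python
def dedupe_rows(rows: list[dict]) -> list[dict]:
    """Drop duplicates keyed on (name, address); keeps the first occurrence."""
    seen: set[tuple] = set()
    out: list[dict] = []
    for row in rows:
        key = (row.get("name"), row.get("address"))
        if key in seen:
            continue
        seen.add(key)
        out.append(row)
    return out


rows = [
    {"name": "Acme Dental", "address": "123 Main St"},
    {"name": "Acme Dental", "address": "123 Main St"},
    {"name": "Bright Smiles", "address": "456 Oak Ave"},
]
print(len(dedupe_rows(rows)))  # 2
```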

Step 4 (Optional): Capture a few reviews

Reviews are loaded in a scrollable dialog.

If you want “last 10 reviews” for enrichment, you can:

  • click the reviews button
  • scroll the dialog
  • extract name/date/text

Because selectors vary, consider this a template:


def collect_reviews(page, max_reviews: int = 20) -> list[dict]:
    out = []

    # open reviews
    btn = page.locator('button[jsaction*="pane.rating.moreReviews"]').first
    if btn.count() == 0:
        return out
    btn.click()
    page.wait_for_timeout(1500)

    # scroll and collect review blocks
    for _ in range(15):
        blocks = page.locator('div[role="article"]')
        for i in range(blocks.count()):
            b = blocks.nth(i)
            text = b.text_content() or ""
            text = " ".join(text.split())
            if len(text) < 30:
                continue
            out.append({"text": text})
            if len(out) >= max_reviews:
                return out

        page.mouse.wheel(0, 1400)
        page.wait_for_timeout(900)

    return out

Where ProxiesAPI helps (realistically)

Google Maps is aggressive about bot detection.

If you scale this beyond a handful of queries, you’ll want:

  • IP rotation
  • retries
  • a way to recover from intermittent blocks

That’s where ProxiesAPI fits: not as a magic bypass, but as a reliability layer when your pipeline is doing many requests.


QA checklist

  • For a single query, you can collect at least 10 place URLs
  • Opening place URLs yields a place panel (not consent/bot interstitial)
  • Parsed rows have name + address for at least 70% of places
  • Your crawler uses pacing (no tight loops)
  • You saved screenshots for selector verification

Next upgrades

  1. Add dedupe: same place can appear across searches
  2. Store to SQLite and enrich incrementally
  3. Introduce job queueing (one city/query per worker)
  4. Consider the official API if you need stable long-term production use
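For upgrade 2, here is a sketch of an idempotent SQLite writer (table and column names are illustrative). Using the place URL as primary key means re-runs update existing rows instead of duplicating them:

```python
import sqlite3


def upsert_places(rows: list[dict], db_path: str = "leads.db") -> None:
    """Persist parsed places; re-running updates rows keyed on url."""
    con = sqlite3.connect(db_path)
    con.execute("""
        CREATE TABLE IF NOT EXISTS places (
            url TEXT PRIMARY KEY,
            name TEXT, rating TEXT, reviews TEXT, category TEXT,
            address TEXT, phone TEXT, website TEXT
        )
    """)
    for r in rows:
        con.execute("""
            INSERT INTO places (url, name, rating, reviews, category,
                                address, phone, website)
            VALUES (:url, :name, :rating, :reviews, :category,
                    :address, :phone, :website)
            ON CONFLICT(url) DO UPDATE SET
                name=excluded.name, rating=excluded.rating,
                reviews=excluded.reviews, category=excluded.category,
                address=excluded.address, phone=excluded.phone,
                website=excluded.website
        """, r)
    con.commit()
    con.close()
```

The dict keys line up with the Place dataclass fields, so `upsert_places([asdict(place) for place in places])` slots in where write_csv() is called today.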