#json

48 guides

Scrape Book Reviews and Ratings from Goodreads with Python (JSON-LD + Top Reviews)

Learn how to scrape Goodreads book pages responsibly: extract the rating, rating count, and review count via JSON-LD, parse key metadata, and collect top review snippets. Includes a screenshot and ProxiesAPI-ready request patterns.

Web Scraping with C# and HtmlAgilityPack: A Practical 2026 Tutorial

A from-scratch C# web scraping tutorial using HttpClient + HtmlAgilityPack: requests, parsing, pagination, and exporting to CSV/JSON. Includes reliability patterns and when to add a proxy layer like ProxiesAPI.

Scrape Currency Exchange Rates with Python (Daily FX Dataset) + ProxiesAPI

Build a daily FX-rates dataset: scrape a real rates table, validate values, write CSV/JSONL, and keep it running with retries. Includes a ProxiesAPI-ready network layer and a screenshot of the source page.

Scrape Goodreads Author Pages: Books, Series, Ratings (ProxiesAPI + Python)

Extract author profile data plus a clean list of books (title, URL, average rating, rating count) from Goodreads author pages. Includes real selectors, retries, and a screenshot.

Google News Scraping: Build a Custom News Aggregator

Build a lightweight Google News-based aggregator: search by topic, extract headlines and publishers, dedupe, and export a daily feed. Includes selectors, retries, and a ProxiesAPI fetch option.

Web Scraping with Go (Colly Framework): Complete Guide

Learn web scraping in Go using Colly: selectors, concurrency, rate limits, retries, and exporting to JSON/CSV. Includes a practical ProxiesAPI integration pattern for more reliable crawling.

Web Scraping with TypeScript in 2026: Playwright + Cheerio End-to-End Guide

A practical TypeScript scraping pipeline: Playwright for rendering and navigation, Cheerio for fast parsing, plus retries/backoff, queue design, and export to JSON/CSV. Includes proxy-rotation hooks and honest notes on where ProxiesAPI belongs.

Scrape Product Reviews from Best Buy with Python (SKU + Ratings + Pagination)

A practical Best Buy reviews scraper in Python: extract SKU from a product URL, pull reviews from Best Buy’s UGC endpoint, normalize fields, paginate safely, and export JSON/CSV. Includes a target-page screenshot and an optional ProxiesAPI fetch layer.

How to Scrape Shopify Stores: Products, Prices, and Inventory (2026)

Practical Shopify scraping patterns: discover product JSON endpoints, paginate collections, extract variants + availability, and reduce blocks while staying ethical.

Scrape Live Stock Data from Yahoo Finance with Python (Quotes + Key Stats)

A resilient Yahoo Finance scraper in Python: fetch quote pages via ProxiesAPI, extract live-ish quote fields + key stats from embedded JSON, handle retries, and export to CSV.

Scrape Book Data from Goodreads with Python (List Pages + Pagination)

Scrape Goodreads list pages for title/author/rating/reviews with Python: fetch via ProxiesAPI, parse real HTML selectors, paginate safely, and export CSV/JSON.

Scrape BBC News Headlines and Article URLs with Python (Sections + Deduping)

Scrape BBC News section pages to collect headlines and article URLs with Python + BeautifulSoup. Includes a simple dedupe store (JSON), multiple sections, and a ProxiesAPI fetch wrapper for stability.

Web Scraping with Rust: reqwest + scraper Crate Tutorial

A modern Rust scraping starter: fetch pages with reqwest, parse HTML with the scraper crate, handle pagination, export JSON/CSV, and add proxy support (including ProxiesAPI via HTTP proxy env vars).

Scrape Podcast Data from Apple Podcasts (Charts + Show/Episode Metadata) with Python + ProxiesAPI

Build a clean dataset of Apple Podcasts charts → show pages → episode lists. Includes stable IDs, incremental updates, and a scraper-friendly request layer using ProxiesAPI.

Scrape Google Play Store App Data with Python (Ratings, Reviews, and Install Counts)

Extract Play Store app metadata and reviews by crawling app detail pages and review endpoints safely. Includes a ProxiesAPI-ready network layer and a repeatable crawl plan.

Scrape Expedia Flight and Hotel Data with Python (Step-by-Step)

A practical Expedia scraper in Python using Playwright: open search results, extract hotel cards (and note where flight offers live), paginate safely, and export clean JSON/CSV. Includes ProxiesAPI-friendly network patterns and a screenshot.

How to Scrape Google Finance Data with Python (Quotes, News, and Historical Prices)

Scrape Google Finance quote pages for price, key stats, news headlines, and a simple historical price series with Python. Includes selector-first HTML parsing, CSV export, and block-avoidance tactics (timeouts, retries, and ProxiesAPI-friendly patterns).

Scrape Sports Scores from ESPN (Python + ProxiesAPI)

Fetch ESPN’s scoreboard page, parse games + teams + scores into a clean table, then export CSV/JSON. Includes a screenshot and a resilient parsing strategy.

Scrape Podcast Data from Apple Podcasts: Charts + Episode Metadata (Python + ProxiesAPI)

Scrape Apple Podcasts chart pages, extract show details, then pull episode metadata into a clean dataset. Includes a screenshot and robust parsing with fallbacks.

Scrape Book Data from Goodreads (Titles, Authors, Ratings, and Reviews)

A practical Goodreads scraper in Python: collect book title/author/rating count/review count + key metadata using robust selectors, ProxiesAPI in the fetch layer, and export to JSON/CSV.

Scrape Wikipedia Article Data at Scale (Tables + Infobox + Links)

Extract structured fields from many Wikipedia pages (infobox + tables + links) with ProxiesAPI + Python, then save to CSV/JSON.

Scrape Weather Data for Any City (Open-Meteo)

Build a lightweight weather dataset pipeline: geocode a city, fetch forecasts from Open-Meteo, add caching + retries, and export clean JSON/CSV.

How to Scrape Business Reviews from Yelp (Python + ProxiesAPI)

Extract Yelp search results and business-page review snippets with Python. Includes pagination, resilient selectors, retries, and a clean JSON/CSV export.

How to Scrape Apartment Listings from Apartments.com (Python + ProxiesAPI)

Scrape Apartments.com listing cards and detail-page fields with Python. Includes pagination, resilient parsing, retries, and clean JSON/CSV exports.

Build a Job Board with Data from Indeed (Python scraper tutorial)

Scrape Indeed job listings (title, company, location, salary, summary) with Python (requests + BeautifulSoup), then save a clean dataset you can render as a simple job board. Includes pagination + ProxiesAPI fetch.

How to Scrape GitHub Releases with Python (Versions + Notes + Diffs)

Scrape a GitHub Releases page, extract versions and release notes, and store structured data so you can alert on changes.

Scrape Secondhand Fashion Listings from Vinted with Python (Search + Pagination + Normalized Output)

Build a practical Vinted scraper: fetch search pages, extract listing cards, follow pagination, normalize results, and export clean JSON/CSV. Includes a screenshot and a ProxiesAPI-ready fetch layer.

Scrape Sports Scores from ESPN with Python (Scoreboard API + Normalized CSV)

Build a reliable ESPN scores scraper: pull scoreboard data for multiple sports, normalize teams/scores/status, and export clean CSV/JSON. Includes a screenshot and a ProxiesAPI-ready fetch layer.

Scrape Numbeo City Cost-of-Living Comparisons (2-City Diff Tables) with Python

Extract Numbeo city-vs-city cost of living comparison rows into a clean dataset (item, city1, city2, percent diff). Includes screenshot, URL builder, and robust table parsing.

Scrape Vinted Listings with Python: Search + Pagination + Clean CSV Export

Build a practical Vinted listings scraper: pull search results via Vinted’s internal catalog endpoint, paginate safely, extract price/brand/size/image URLs, and export a clean CSV. Includes a screenshot + ProxiesAPI integration.

Scrape Stack Overflow with Python: Tag Pages + Question Threads + Q/A Export

Build a production-ready Stack Overflow scraper: crawl tag pages, follow question links, extract question + answers + votes, and export JSON/CSV. Includes a screenshot and ProxiesAPI integration hooks.

Scrape Shopee Reviews at Scale: Ratings, Review Text, and Product Metadata

Fetch Shopee product metadata + reviews via ProxiesAPI, paginate ratings safely, and export clean JSON/CSV for analysis. Includes robust URL parsing, retry/backoff, and a screenshot of a real product page.

Scrape Government Contract Data from SAM.gov (Opportunities + Details)

Build an end-to-end SAM.gov scraper: search opportunities, paginate results, fetch detail pages, normalize fields, and export JSON/CSV using ProxiesAPI. Includes screenshots + robust retry patterns.

How to Scrape Stack Overflow Questions and Accepted Answers with Python (By Tag)

Build a resilient Stack Overflow scraper: crawl tag pages, extract question metadata, follow links, and parse accepted answers. Includes retries, dedupe, and ProxiesAPI-ready requests + a screenshot of the tag page.

Scrape UK Property Prices from Rightmove (Dataset Builder + Screenshots)

Build a repeatable Rightmove sold-price dataset pipeline in Python: crawl result pages, extract listing URLs, parse sold-price details, and export clean CSV/JSON with retries and politeness.

Scrape Government Contract Data from SAM.gov (Opportunities + Details)

Build a SAM.gov opportunities dataset in Python: search with filters, paginate results, follow detail pages, and export structured contract fields with retries and polite crawling.

Scrape Stack Overflow Questions and Answers by Tag (Python + ProxiesAPI)

Collect Stack Overflow Q&A for a tag with pagination, answer extraction, and a proof screenshot. Export clean JSON for analysis.

Scrape IMDb Top 250 Movies into a Dataset (Python + ProxiesAPI)

Extract IMDb Top 250 movies (rank, title, year, rating, vote count) into clean CSV/JSON — with robust parsing, retries, and polite crawling.

Scrape Hacker News: Top Stories + Comments (Python + ProxiesAPI)

Scrape HN front pages and full comment threads into clean JSON — with pagination, robust selectors, retries, and an honest scaling path with ProxiesAPI.

Scrape Pinterest Images and Pins (Search + Board URLs) with Python + ProxiesAPI

Extract pin titles, image URLs, outbound links, and board metadata from Pinterest search + board pages with pagination, retries, and defensive parsing. Includes a screenshot of the target UI.

Scrape Netflix Catalogue Data with Python + ProxiesAPI (Titles, Genres, Availability)

Build a repeatable Netflix title dataset from listing pages: extract title rows, handle pagination defensively, dedupe, and export clean JSONL. Includes a screenshot of the target UI.

Scrape Numbeo Cost of Living Data with Python (cities, indices, and tables)

Extract Numbeo cost-of-living tables into a structured dataset (with a screenshot), then export to JSON/CSV using ProxiesAPI-backed requests.

Scrape Stack Overflow Questions and Answers by Tag (Python + ProxiesAPI)

Extract Stack Overflow question lists and accepted answers for a tag with robust retries, respectful rate limits, and a validation screenshot. Export to JSON/CSV.

Scrape Patreon Creator Data with Python (Profiles, Tiers, Posts)

Extract Patreon creator metadata, membership tiers, and recent public posts with a screenshot-first workflow, robust retries, and ProxiesAPI-backed requests.

Scrape Reddit Forum Data with Python: Posts, Comments, and Pagination

Scrape subreddit listing pages and comment threads with Python (requests + BeautifulSoup) using the old.reddit.com HTML, plus safe pagination, retry/backoff, and ProxiesAPI-friendly request patterns. Includes a screenshot.

Scrape NBA Scores and Standings from ESPN with Python (Box Scores + Schedule)

Build a clean dataset of today’s NBA games and standings from ESPN pages using robust selectors and proxy-safe requests.

Scrape IMDb Top 250 Movies into a Dataset

Pull rank, title, year, rating, and votes into clean CSV/JSON for analysis with working Python code.

How to Scrape Hacker News (HN) with Python: Stories + Pagination + Comments

A production-grade Hacker News scraper: parse the real HTML, crawl multiple pages, extract stories and comment threads, and export clean JSON. Includes terminal-style runs and selector rationale.