#json
25 guides
Scrape SAM.gov Contract Opportunities with Python (Search API + Dataset + Screenshots)
Build a production-grade dataset builder for SAM.gov contract opportunities: query the official Opportunities API, paginate results, normalize fields, and export JSONL/CSV. Includes proof screenshots, retry logic, and guidance on where ProxiesAPI fits when you also crawl linked docs.
Web Scraping with Go (Colly Framework): Complete Guide
Learn web scraping in Go using Colly: selectors, concurrency, rate limits, retries, and exporting to JSON/CSV. Includes a practical ProxiesAPI integration pattern for more reliable crawling.
Scrape Expedia Flight and Hotel Data with Python (Step-by-Step)
A practical Expedia scraper in Python using Playwright: open search results, extract hotel cards (and see where flight offers live), paginate safely, and export clean JSON/CSV. Includes ProxiesAPI-friendly network patterns and a screenshot.
How to Scrape Google Finance Data with Python (Quotes, News, and Historical Prices)
Scrape Google Finance quote pages for price, key stats, news headlines, and a simple historical price series with Python. Includes selector-first HTML parsing, CSV export, and block-avoidance tactics (timeouts, retries, and ProxiesAPI-friendly patterns).
Scrape Sports Scores from ESPN (Python + ProxiesAPI)
Fetch ESPN’s scoreboard page, parse games + teams + scores into a clean table, then export CSV/JSON. Includes a screenshot and a resilient parsing strategy.
Scrape Podcast Data from Apple Podcasts: Charts + Episode Metadata (Python + ProxiesAPI)
Scrape Apple Podcasts chart pages, extract show details, then pull episode metadata into a clean dataset. Includes screenshot + robust parsing with fallbacks.
Web Scraping with C# and HtmlAgilityPack: A Practical 2026 Tutorial
A from-scratch C# web scraping tutorial using HttpClient + HtmlAgilityPack: requests, parsing, pagination, and exporting to CSV/JSON. Includes reliability patterns and when to add a proxy layer like ProxiesAPI.
Scrape Currency Exchange Rates (USD/EUR/INR) into a daily dataset with Python + ProxiesAPI
Build a small daily FX dataset pipeline: fetch exchange rates, validate values, write CSV/JSON, and keep it running with retries. Includes a ProxiesAPI-ready network layer.
How to Scrape Shopify Stores: Products, Prices, and Inventory (2026)
Practical Shopify scraping patterns: discover product JSON endpoints, paginate collections, extract variants + availability, and reduce blocks while staying ethical.
Scrape Wikipedia Article Data at Scale (Tables + Infobox + Links)
Extract structured fields from many Wikipedia pages (infobox + tables + links) with ProxiesAPI + Python, then save to CSV/JSON.
Scrape Weather Data for Any City (Open-Meteo)
Build a lightweight weather dataset pipeline: geocode a city, fetch forecasts from Open-Meteo, add caching + retries, and export clean JSON/CSV.
How to Scrape Business Reviews from Yelp (Python + ProxiesAPI)
Extract Yelp search results and business-page review snippets with Python. Includes pagination, resilient selectors, retries, and a clean JSON/CSV export.
How to Scrape Apartment Listings from Apartments.com (Python + ProxiesAPI)
Scrape Apartments.com listing cards and detail-page fields with Python. Includes pagination, resilient parsing, retries, and clean JSON/CSV exports.
Build a Job Board with Data from Indeed (Python scraper tutorial)
Scrape Indeed job listings (title, company, location, salary, summary) with Python (requests + BeautifulSoup), then save a clean dataset you can render as a simple job board. Includes pagination + ProxiesAPI fetch.
How to Scrape GitHub Releases with Python (Versions + Notes + Diffs)
Scrape a GitHub Releases page, extract versions and release notes, and store structured data so you can alert on changes.
Scrape Pinterest Images and Pins (Search + Board URLs) with Python + ProxiesAPI
Extract pin titles, image URLs, outbound links, and board metadata from Pinterest search + board pages with pagination, retries, and defensive parsing. Includes a screenshot of the target UI.
Scrape Netflix Catalogue Data with Python + ProxiesAPI (Titles, Genres, Availability)
Build a repeatable Netflix title dataset from listing pages: extract title rows, handle pagination defensively, dedupe, and export clean JSONL. Includes a screenshot of the target UI.
Scrape Numbeo Cost of Living Data with Python (cities, indices, and tables)
Extract Numbeo cost-of-living tables into a structured dataset (with a screenshot), then export to JSON/CSV using ProxiesAPI-backed requests.
Scrape Stack Overflow Questions and Answers by Tag (Python + ProxiesAPI)
Extract Stack Overflow question lists and accepted answers for a tag with robust retries, respectful rate limits, and a validation screenshot. Export to JSON/CSV.
Scrape Patreon Creator Data with Python (Profiles, Tiers, Posts)
Extract Patreon creator metadata, membership tiers, and recent public posts with a screenshot-first workflow, robust retries, and ProxiesAPI-backed requests.
Scrape Reddit Forum Data with Python: Posts, Comments, and Pagination
Scrape subreddit listing pages and comment threads with Python (requests + BeautifulSoup) using the old.reddit.com HTML, plus safe pagination, retry/backoff, and ProxiesAPI-friendly request patterns. Includes a screenshot.
Scrape NBA Scores and Standings from ESPN with Python (Box Scores + Schedule)
Build a clean dataset of today’s NBA games and standings from ESPN pages using robust selectors and proxy-safe requests.
Scrape Book Data from Goodreads (Titles, Authors, Ratings, and Reviews)
A practical Goodreads scraper in Python: collect book title/author/rating count/review count + key metadata using robust selectors, ProxiesAPI in the fetch layer, and export to JSON/CSV.
Scrape IMDb Top 250 Movies into a Dataset
Pull rank, title, year, rating, and votes into clean CSV/JSON for analysis, with working Python code.
How to Scrape Hacker News (HN) with Python: Stories + Pagination + Comments
A production-grade Hacker News scraper: parse the real HTML, crawl multiple pages, extract stories and comment threads, and export clean JSON. Includes terminal-style runs and selector rationale.