Web Scraping VBA: Extract Data from Websites into Excel

Jun 28, 2026 · guide · #vba, #excel, #web scraping vba, #web-scraping, #http, #html, #tables, #pagination

If you searched for web scraping VBA, you are probably not trying to build a giant distributed crawler. You are trying to get website data into Excel because that is where the business process already lives.

That is a valid use case. The mistake is pretending VBA scraping works the same way as a modern Python scraper. It does not. VBA is slower, less ergonomic, and easier to break. But for small, server-rendered targets it can still be perfectly useful.

This guide focuses on the workflows that actually hold up:

fetch HTML with WinHTTP
parse DOM with MSHTML.HTMLDocument
extract tables or repeated cards into a worksheet
paginate safely
add retries and a stable request layer

Keep Excel scrapers alive when the network layer gets messy

VBA projects usually fail in the fetch step first: timeouts, 403s, 429s, and inconsistent responses. ProxiesAPI lets you add a proxy-backed request URL without rewriting the rest of your Excel macro.

Get 1,000 free API calls View pricing

When VBA is the right tool

Use VBA when:

Excel is the final destination anyway
the site is mostly server-rendered HTML
the job is small enough to run serially
installing Python or Node is difficult in your environment

Do not use VBA when:

the data appears only after heavy client-side JavaScript
you need concurrency or high throughput
the scraper must run unattended across thousands of URLs

If you are in that second bucket, build the scraper elsewhere and import the output into Excel.

Your three main options

Approach	Best for	Good parts	Tradeoffs
Power Query	clean tables and feeds	easiest to maintain	limited custom parsing
VBA + WinHTTP + MSHTML	custom HTML extraction in Excel	flexible, deployable	brittle under retries, blocks, and markup drift
Python scraper -> CSV -> Excel	larger scraping jobs	most reliable	extra runtime and tooling

This post is about the middle option.

Step 1: Enable the references

In the VBA editor:

Open Tools -> References
Enable:
- Microsoft WinHTTP Services, version 5.1
- Microsoft HTML Object Library

If your environment blocks references, you can use late binding, but early binding is easier to debug.

Step 2: Fetch HTML with WinHTTP

Option Explicit

Private Function HttpGet(ByVal targetUrl As String) As String
    Dim http As WinHttp.WinHttpRequest
    Set http = New WinHttp.WinHttpRequest

    http.SetTimeouts 10000, 10000, 10000, 30000
    http.Open "GET", targetUrl, False
    http.SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    http.SetRequestHeader "Accept-Language", "en-US,en;q=0.9"
    http.Send

    If http.Status < 200 Or http.Status >= 300 Then
        Err.Raise vbObjectError + 1000, "HttpGet", "HTTP " & http.Status & " for " & targetUrl
    End If

    HttpGet = http.ResponseText
End Function

This function does exactly one thing: fetch HTML. Do not mix parsing and worksheet writes into the same routine unless you enjoy debugging giant macros.

Step 3: Add retry and backoff

Transient failures are normal. Treat them as part of the job.

Private Function HttpGetWithRetry(ByVal targetUrl As String, Optional ByVal maxRetries As Long = 4) As String
    Dim attempt As Long

    For attempt = 1 To maxRetries
        On Error GoTo TryAgain
        HttpGetWithRetry = HttpGet(targetUrl)
        Exit Function

TryAgain:
        If attempt = maxRetries Then
            Err.Raise Err.Number, Err.Source, Err.Description
        End If

        Application.Wait Now + TimeSerial(0, 0, 2 ^ (attempt - 1))
        Err.Clear
    Next attempt
End Function

If you skip retries, your macro will look "randomly broken" even though the problem is just normal network noise.

Step 4: Parse the response into an HTML document

Private Function ParseHtml(ByVal html As String) As MSHTML.HTMLDocument
    Dim doc As MSHTML.HTMLDocument
    Set doc = New MSHTML.HTMLDocument
    doc.body.innerHTML = html
    Set ParseHtml = doc
End Function

From here, use CSS selectors with querySelector and querySelectorAll instead of fragile string slicing.

Step 5: Extract a table into Excel

This is the classic use case for web scraping VBA.

Private Sub WriteFirstTableToSheet(ByVal doc As MSHTML.HTMLDocument, ByVal ws As Worksheet)
    Dim tbl As MSHTML.HTMLTable
    Dim rowNode As MSHTML.HTMLTableRow
    Dim cellNode As MSHTML.HTMLTableCell
    Dim rowIndex As Long
    Dim colIndex As Long

    Set tbl = doc.querySelector("table")
    If tbl Is Nothing Then
        Err.Raise vbObjectError + 2000, "WriteFirstTableToSheet", "No table found"
    End If

    rowIndex = 1
    For Each rowNode In tbl.Rows
        colIndex = 1
        For Each cellNode In rowNode.Cells
            ws.Cells(rowIndex, colIndex).Value = Trim$(cellNode.innerText)
            colIndex = colIndex + 1
        Next cellNode
        rowIndex = rowIndex + 1
    Next rowNode
End Sub

If the site has more than one table, switch the selector to something more specific, such as table.pricing or div.results table.

Step 6: Handle repeated card layouts

Many sites are not table-based. They render a list of cards instead. VBA can still handle that if the HTML is server-rendered.

Private Sub WriteCardsToSheet(ByVal doc As MSHTML.HTMLDocument, ByVal ws As Worksheet)
    Dim cards As IHTMLDOMChildrenCollection
    Dim card As IHTMLElement
    Dim rowIndex As Long

    Set cards = doc.querySelectorAll("div.product-card")
    rowIndex = 2

    ws.Cells(1, 1).Value = "Title"
    ws.Cells(1, 2).Value = "Price"
    ws.Cells(1, 3).Value = "Link"

    For Each card In cards
        ws.Cells(rowIndex, 1).Value = card.querySelector("h2").innerText
        ws.Cells(rowIndex, 2).Value = card.querySelector(".price").innerText
        ws.Cells(rowIndex, 3).Value = card.querySelector("a").getAttribute("href")
        rowIndex = rowIndex + 1
    Next card
End Sub

The key habit is to inspect the site once, pick narrow selectors, and keep those selectors together in one parsing routine.

Step 7: Paginate without running forever

Private Function BuildPageUrl(ByVal baseUrl As String, ByVal pageNum As Long) As String
    If InStr(1, baseUrl, "?", vbTextCompare) > 0 Then
        BuildPageUrl = baseUrl & "&page=" & CStr(pageNum)
    Else
        BuildPageUrl = baseUrl & "?page=" & CStr(pageNum)
    End If
End Function

Public Sub ScrapePages()
    Dim baseUrl As String
    Dim ws As Worksheet
    Dim pageNum As Long
    Dim nextRow As Long

    baseUrl = "https://example.com/listings"
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    ws.Cells.Clear

    nextRow = 1

    For pageNum = 1 To 5
        Dim html As String
        Dim doc As MSHTML.HTMLDocument

        html = HttpGetWithRetry(BuildPageUrl(baseUrl, pageNum))
        Set doc = ParseHtml(html)

        If doc.querySelectorAll("table").Length = 0 Then Exit For

        ws.Cells(nextRow, 1).Value = "Page " & pageNum
        nextRow = nextRow + 1

        Call WriteFirstTableToSheet(doc, ws)
        nextRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row + 2

        Application.Wait Now + TimeSerial(0, 0, 1)
    Next pageNum
End Sub

Add a hard page cap. A runaway loop in VBA is more annoying than in Python because people usually notice it only after Excel freezes.

Where ProxiesAPI fits

If the site starts returning 403s, 429s, or inconsistent HTML, change the fetch URL rather than rebuilding the macro.

Private Function WrapWithProxiesApi(ByVal targetUrl As String, ByVal apiKey As String) As String
    WrapWithProxiesApi = "https://api.proxiesapi.com/?auth_key=" & apiKey & "&url=" & targetUrl
End Function

Then call:

html = HttpGetWithRetry(WrapWithProxiesApi(BuildPageUrl(baseUrl, pageNum), "YOUR_KEY"))

That keeps your macro architecture the same:

build URL
fetch HTML
parse DOM
write cells

The proxy layer changes. The parser does not.

Practical advice

Start with one page and one selector before adding loops.
Save the raw HTML of a failing page when debugging.
Do not scrape JavaScript-heavy apps with VBA unless you enjoy false starts.
Keep worksheet writes separate from parsing logic so you can test both independently.

VBA scraping is not elegant, but it does not need to be. If the job is small, the selectors are stable, and Excel is the final destination, it is still a useful tool in 2026.

Keep Excel scrapers alive when the network layer gets messy

VBA projects usually fail in the fetch step first: timeouts, 403s, 429s, and inconsistent responses. ProxiesAPI lets you add a proxy-backed request URL without rewriting the rest of your Excel macro.

Get 1,000 free API calls View pricing

A pragmatic VBA web scraping guide for Excel: HTTP requests, HTML parsing, pagination, retries, and how to route requests through a ProxiesAPI proxy when sites block you.

guide#vba#excel#web-scraping

Web Scraping Excel: Import Website Data into Spreadsheets (No-Code + Power Query + VBA)

A practical guide to getting website data into Excel: Power Query (HTML tables + pagination), Office Scripts for scheduled pulls, and VBA for legacy flows—plus when you still need proxies and a Python pipeline.

guides#excel#power-query#office-scripts

Web Scraping with Rust: reqwest + scraper Crate Tutorial

A practical Rust scraping guide: fetch pages with reqwest, rotate proxies, parse HTML with the scraper crate, handle retries/timeouts, and export structured data.

guide#rust#web-scraping#reqwest

Web Scraping with Ruby: Nokogiri + HTTParty Tutorial

Walk through a production-friendly Ruby scraper with retries, parsing, pagination, and proxy support using Nokogiri and HTTParty.

guide#ruby#nokogiri#httparty

Web Scraping VBA: Extract Data from Websites into Excel

Related guides