Web Scraping VBA: Extract Data from Websites into Excel (Tables + Pagination)

If you searched for web scraping VBA, you’re probably in one of these situations:

  • Excel is your team’s reporting tool, and the “source” is a website.
  • You only need a small dataset (dozens to hundreds of rows).
  • IT policy blocks installing Python/Node on the machine that runs the report.

This guide is a pragmatic playbook for extracting website data into Excel with VBA:

  • fetch HTML with WinHTTP
  • parse DOM with MSHTML.HTMLDocument
  • extract tables into a worksheet
  • paginate safely
  • add retries/backoff so your macro doesn’t die on the first timeout
If VBA must scrape, keep the network layer stable with ProxiesAPI

Excel scrapers tend to fail in the fetch step first (timeouts, 403/429s, flaky responses). ProxiesAPI gives you a proxy-backed fetch URL you can route WinHTTP through so retries and rotation stay simple.


Reality check: what VBA scraping is (and isn’t) good for

VBA scraping is good when:

  • the pages are server-rendered (HTML contains the data)
  • you can tolerate slower performance
  • you’re scraping a small number of URLs

VBA scraping is a poor fit when:

  • the site is JS-rendered (data loads after page render)
  • you need concurrency and speed
  • you need a robust crawler that runs unattended

If you’re in that second bucket, stop fighting Excel and build the scraper in Python—then load the CSV into Excel.


Approach options (pick the right one)

ApproachBest forProsCons
Excel Power Querytables/feeds you can refresheasiest maintenancelimited for custom parsing
VBA + WinHTTP + MSHTMLcustom parsing in Excelflexible, deployablefragile under blocking/timeouts
Python scraper + CSV → Excelserious scrapingmost reliablerequires runtime/install

This post focuses on the middle option: VBA + WinHTTP + MSHTML.


Step 1: Enable required references

In the VBA editor:

  1. ToolsReferences
  2. Enable:
    • Microsoft WinHTTP Services, version 5.1
    • Microsoft HTML Object Library

If you can’t enable references (locked-down environment), you can still use late binding—but it’s more verbose.


Step 2: Fetch HTML with WinHTTP (with timeouts)

Option Explicit

Private Function HttpGet(ByVal url As String) As String
    Dim http As WinHttp.WinHttpRequest
    Set http = New WinHttp.WinHttpRequest

    http.SetTimeouts 10000, 10000, 10000, 30000 ' resolve, connect, send, receive
    http.Open "GET", url, False
    http.SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    http.SetRequestHeader "Accept-Language", "en-US,en;q=0.9"
    http.Send

    If http.Status < 200 Or http.Status >= 300 Then
        Err.Raise vbObjectError + 1000, "HttpGet", "HTTP " & http.Status & " for " & url
    End If

    HttpGet = http.ResponseText
End Function

Add retry + backoff (mandatory for reliability)

Private Function HttpGetWithRetry(ByVal url As String, Optional ByVal maxRetries As Long = 4) As String
    Dim attempt As Long

    For attempt = 1 To maxRetries
        On Error GoTo TryFail
        HttpGetWithRetry = HttpGet(url)
        Exit Function

TryFail:
        If attempt = maxRetries Then
            Err.Raise Err.Number, Err.Source, Err.Description
        End If
        Application.Wait Now + TimeSerial(0, 0, 2 ^ (attempt - 1))
    Next attempt
End Function

Step 3: Parse HTML with MSHTML and extract a table

Private Function ParseHtml(ByVal html As String) As MSHTML.HTMLDocument
    Dim doc As MSHTML.HTMLDocument
    Set doc = New MSHTML.HTMLDocument
    doc.body.innerHTML = html
    Set ParseHtml = doc
End Function

Now extract the first HTML table (common pattern for “prices”, “lists”, “directories”):

Private Sub WriteFirstTableToSheet(ByVal doc As MSHTML.HTMLDocument, ByVal ws As Worksheet)
    Dim tbl As MSHTML.HTMLTable
    Dim r As Long, c As Long
    Dim row As MSHTML.HTMLTableRow
    Dim cell As MSHTML.HTMLTableCell

    Set tbl = doc.querySelector("table")
    If tbl Is Nothing Then Err.Raise vbObjectError + 2000, "WriteFirstTableToSheet", "No <table> found"

    r = 1
    For Each row In tbl.Rows
        c = 1
        For Each cell In row.Cells
            ws.Cells(r, c).Value = Trim$(cell.innerText)
            c = c + 1
        Next cell
        r = r + 1
    Next row
End Sub

Step 4: Paginate (and stop when the page ends)

Many sites use a ?page=N query parameter. A safe paginator:

  • starts at page 1
  • stops when the HTML repeats (or no rows extracted)
  • caps pages to prevent runaway loops
Private Function BuildPageUrl(ByVal baseUrl As String, ByVal pageNum As Long) As String
    If InStr(1, baseUrl, "?", vbTextCompare) > 0 Then
        BuildPageUrl = baseUrl & "&page=" & CStr(pageNum)
    Else
        BuildPageUrl = baseUrl & "?page=" & CStr(pageNum)
    End If
End Function
Public Sub ScrapePagedTable()
    Dim baseUrl As String
    baseUrl = "https://example.com/list"

    Dim ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    ws.Cells.Clear

    Dim pageNum As Long, outRow As Long
    Dim lastFirstCell As String
    outRow = 1

    For pageNum = 1 To 10
        Dim url As String
        url = BuildPageUrl(baseUrl, pageNum)

        Dim html As String
        html = HttpGetWithRetry(url)

        Dim doc As MSHTML.HTMLDocument
        Set doc = ParseHtml(html)

        Dim tbl As MSHTML.HTMLTable
        Set tbl = doc.querySelector("table")
        If tbl Is Nothing Then Exit For

        Dim firstCellText As String
        If tbl.Rows.Length = 0 Or tbl.Rows(0).Cells.Length = 0 Then Exit For
        firstCellText = Trim$(tbl.Rows(0).Cells(0).innerText)

        If firstCellText = lastFirstCell Then Exit For
        lastFirstCell = firstCellText

        ' Write table under the previous output
        ws.Cells(outRow, 1).Value = "Page " & pageNum
        outRow = outRow + 1

        ' Simple table write (you can adapt to skip headers)
        Dim row As MSHTML.HTMLTableRow, cell As MSHTML.HTMLTableCell
        Dim r As Long, c As Long
        r = outRow
        For Each row In tbl.Rows
            c = 1
            For Each cell In row.Cells
                ws.Cells(r, c).Value = Trim$(cell.innerText)
                c = c + 1
            Next cell
            r = r + 1
        Next row
        outRow = r + 1
    Next pageNum
End Sub

Proxies + VBA (how to keep it from getting blocked)

If your macro starts failing with 403/429 errors or intermittent timeouts:

  • slow down (add delays)
  • reduce page count per run
  • rotate IPs via a proxy-backed fetch layer

ProxiesAPI routing pattern (conceptual)

ProxiesAPI provides a fetch URL like:

http://api.proxiesapi.com/?auth_key=YOUR_KEY&url=https://target.com/page

In VBA, you’d request that ProxiesAPI URL instead of the target URL:

  • build the ProxiesAPI URL in a function
  • keep everything else the same (retries, parsing, pagination)

That way your scraper stays maintainable: fetch → parse → write.


Practical checklist for web scraping VBA

  • Keep HttpGet() isolated and testable.
  • Add retry + exponential backoff.
  • Parse with selectors (querySelector) instead of brittle string splits.
  • Cap pagination.
  • Write raw HTML to a sheet when debugging (you’ll thank yourself).

If you’re scraping more than a few hundred pages, graduate to Python—and keep Excel as the reporting layer.

If VBA must scrape, keep the network layer stable with ProxiesAPI

Excel scrapers tend to fail in the fetch step first (timeouts, 403/429s, flaky responses). ProxiesAPI gives you a proxy-backed fetch URL you can route WinHTTP through so retries and rotation stay simple.

Related guides

Web Scraping with VBA: Extract Website Data into Excel (with Proxies + Retry Logic)
A pragmatic VBA web scraping guide for Excel: HTTP requests, HTML parsing, pagination, retries, and how to route requests through a ProxiesAPI proxy when sites block you.
guide#vba#excel#web-scraping
Web Scraping Excel: Import Website Data into Spreadsheets (No-Code + Power Query + VBA)
A practical guide to getting website data into Excel: Power Query (HTML tables + pagination), Office Scripts for scheduled pulls, and VBA for legacy flows—plus when you still need proxies and a Python pipeline.
guides#excel#power-query#office-scripts
How to Scrape E-Commerce Websites: A Practical Guide
A step-by-step playbook for ecommerce scraping: product selectors, pagination, retries, proxy rotation, and data QA — with real Python patterns you can reuse.
guide#ecommerce scraping#python#web-scraping
Web Scraping with JavaScript and Node.js: Full Tutorial (2026)
An end-to-end Node.js scraping workflow: fetch pages with retries, parse HTML, handle pagination, rotate proxies with ProxiesAPI, and export clean JSON.
guide#javascript#nodejs#web-scraping