Web Scraping with VBA: Extract Website Data into Excel (with Proxies + Retry Logic)

If you’ve ever tried web scraping in VBA, you already know the vibe:

  • It works… until it doesn’t.
  • One random timeout, and your macro is stuck.
  • One “Access denied” page, and your HTML parser happily extracts nonsense.

Still, scraping into Excel can be useful for:

  • pulling a daily price list
  • collecting a small list of URLs
  • building an internal report where Excel is the “source of truth”

This guide focuses on practical VBA scraping:

  • making HTTP requests from Excel (WinHTTP)
  • parsing HTML (MSHTML)
  • handling pagination
  • adding retry + backoff
  • routing requests via a ProxiesAPI proxy when sites rate-limit you


Excel scrapers break first — ProxiesAPI helps them break less

VBA is fragile under blocking and timeouts. If you must scrape from Excel, ProxiesAPI gives you a stable proxy endpoint you can route WinHTTP through to reduce 403/429 pain.


Reality check: when VBA is the wrong tool

VBA scraping is fine when:

  • you need < 100 pages/day
  • the site is simple HTML (no heavy JS rendering)
  • you don’t need concurrency

Switch to Python (or a dedicated scraper) when:

  • you need to scale (> 500 pages)
  • you hit bot protection (Cloudflare, PerimeterX, etc.)
  • the data requires JavaScript rendering

You can still export from Python to Excel, which is often the cleanest compromise.


Prerequisites (Windows Excel)

This tutorial assumes:

  • Excel on Windows (WinHTTP + MSHTML work best there)
  • You can open the VBA editor (ALT + F11)

References to enable

In the VBA editor:

  1. Tools → References
  2. Check:
    • Microsoft HTML Object Library

The examples below use late binding (CreateObject), so this reference is technically optional — enabling it gets you IntelliSense and named constants. If you can't enable references at all, you can still scrape via plain string parsing, but it's more brittle.
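As a minimal sketch of that string-parsing fallback: grab the text between two markers with InStr and Mid. No references needed, but it breaks the moment the surrounding markup changes.

```vba
' Extract the text between two markers using plain string functions.
' Brittle by design: any change to the markup around the target breaks it.
Private Function ExtractBetween(ByVal html As String, _
                                ByVal startMarker As String, _
                                ByVal endMarker As String) As String
    Dim i As Long, j As Long

    i = InStr(1, html, startMarker, vbTextCompare)
    If i = 0 Then Exit Function          ' start marker not found
    i = i + Len(startMarker)

    j = InStr(i, html, endMarker, vbTextCompare)
    If j = 0 Then Exit Function          ' end marker not found

    ExtractBetween = Trim$(Mid$(html, i, j - i))
End Function
```

Usage: ExtractBetween(html, "&lt;h1&gt;", "&lt;/h1&gt;") returns the first h1's raw contents.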


Step 1: HTTP GET in VBA (WinHTTP)

WinHTTP is a solid choice for HTTP requests.

Option Explicit

Private Function HttpGet(ByVal url As String, Optional ByVal timeoutMs As Long = 30000) As String
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")

    http.Open "GET", url, False

    http.SetTimeouts timeoutMs, timeoutMs, timeoutMs, timeoutMs

    http.SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123 Safari/537.36"
    http.SetRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
    http.SetRequestHeader "Accept-Language", "en-US,en;q=0.9"

    http.Send

    If http.Status < 200 Or http.Status >= 300 Then
        Err.Raise vbObjectError + 101, "HttpGet", "HTTP " & http.Status & " for " & url
    End If

    HttpGet = http.ResponseText
End Function

Step 2: Parse HTML (MSHTML)

MSHTML can parse an HTML string into a document object you can query much like a browser DOM.

Private Function HtmlDoc(ByVal html As String) As Object
    Dim doc As Object
    Set doc = CreateObject("htmlfile")

    doc.Open
    doc.Write html
    doc.Close

    Set HtmlDoc = doc
End Function

Extracting elements

Private Function GetInnerTextBySelector(ByVal doc As Object, ByVal tagName As String, ByVal className As String) As String
    Dim el As Object
    Dim els As Object

    Set els = doc.getElementsByTagName(tagName)

    For Each el In els
        If InStr(1, " " & el.className & " ", " " & className & " ", vbTextCompare) > 0 Then
            GetInnerTextBySelector = Trim(el.innerText)
            Exit Function
        End If
    Next el

    GetInnerTextBySelector = ""
End Function

VBA doesn’t have great CSS selectors, so you typically:

  • search by tag
  • filter by className or attributes
  • keep the parsing logic simple
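The same tag-then-filter approach works for attributes. A sketch that collects every link from a parsed document (doc comes from HtmlDoc above; note that an htmlfile document has no real base URL, so relative hrefs may come back resolved against about:blank — check what you get):

```vba
' Collect the href of every <a> element in the parsed document.
Private Function GetAllLinks(ByVal doc As Object) As Collection
    Dim links As New Collection
    Dim el As Object

    For Each el In doc.getElementsByTagName("a")
        ' Skip anchors without an href
        If Len(el.href) > 0 Then links.Add el.href
    Next el

    Set GetAllLinks = links
End Function
```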

Step 3: Add retry + exponential backoff

Retries matter more than people think. The web is flaky.

Private Function HttpGetWithRetry(ByVal url As String, Optional ByVal attempts As Integer = 5) As String
    Dim i As Integer
    Dim delayMs As Long

    delayMs = 1000

    For i = 1 To attempts
        On Error GoTo TryFail
        HttpGetWithRetry = HttpGet(url, 30000)
        Exit Function

NextAttempt:
        ' Backoff (1s, 2s, 4s, 8s...)
        Sleep delayMs
        delayMs = delayMs * 2
    Next i
    Exit Function

TryFail:
    If i = attempts Then
        Err.Raise vbObjectError + 102, "HttpGetWithRetry", "Failed after retries: " & url
    End If
    Resume NextAttempt
End Function

The Resume is not optional. If you just fall through a label after an error, VBA stays in error-handling mode, and the next failure inside the loop won't be trapped — your second retry crashes the macro. Resume clears that state before re-entering the loop.

Sleep is a Win32 API call, not a built-in VBA function. Add this declaration at the top of the module:

#If VBA7 Then
    Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#Else
    Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If

Note that dwMilliseconds stays Long even under VBA7: it's a 32-bit DWORD, and PtrSafe only marks the declaration as 64-bit aware.

Step 4: Pagination pattern (collect multiple pages)

Most pagination boils down to:

  • find the “next page” URL
  • loop until maxPages

In VBA, the easiest option is often to build URLs yourself.

Example pattern:

  • https://example.com/list?page=1
  • ...page=2
Public Sub ScrapePaginatedList()
    Dim baseUrl As String
    Dim page As Integer
    Dim html As String

    baseUrl = "https://example.com/list?page="

    For page = 1 To 5
        html = HttpGetWithRetry(baseUrl & CStr(page), 5)
        ' parse html and write rows to sheet
        Debug.Print "page", page, "len", Len(html)
    Next page
End Sub
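When you don't know the page count up front, a common variant loops until a page comes back empty. A sketch, assuming the listing markup contains class="product" — that marker is site-specific, so adapt it:

```vba
Public Sub ScrapeUntilEmpty()
    Dim page As Integer
    Dim html As String
    Dim maxPages As Integer

    maxPages = 50   ' hard safety cap so a bad check can't loop forever

    For page = 1 To maxPages
        html = HttpGetWithRetry("https://example.com/list?page=" & page, 5)

        ' Stop when the listing container is missing (site-specific check)
        If InStr(1, html, "class=""product""", vbTextCompare) = 0 Then Exit For

        ' parse html and write rows here
        Debug.Print "page", page, "len", Len(html)
    Next page
End Sub
```

Always keep the hard cap: an "is this page empty?" check that silently stops matching is the classic way an unattended macro runs all night.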

ProxiesAPI: how to use a proxy from WinHTTP

When you start seeing 403/429 or timeouts, a proxy helps.

With ProxiesAPI, you typically get a proxy endpoint like:

  • http://USER:PASS@gateway.proxiesapi.com:PORT

In WinHTTP you can set a proxy with SetProxy.

Private Function HttpGetViaProxy(ByVal url As String, ByVal proxy As String) As String
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")

    http.Open "GET", url, False

    ' 2 = HTTPREQUEST_PROXYSETTING_PROXY; proxy string is "HOST:PORT"
    http.SetProxy 2, proxy

    http.SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123 Safari/537.36"

    http.Send

    If http.Status < 200 Or http.Status >= 300 Then
        Err.Raise vbObjectError + 103, "HttpGetViaProxy", "HTTP " & http.Status & " via proxy for " & url
    End If

    HttpGetViaProxy = http.ResponseText
End Function

Note about credentials

If your proxy uses username/password, many HTTP clients let you embed them in the proxy URL. WinHTTP is not one of them: credentials embedded in the proxy string are ignored, so you have to supply them separately. Your options, roughly in order of simplicity:

  • pass the credentials with WinHttpRequest's SetCredentials method (flag 1 = proxy credentials)
  • use a gateway that authorizes your machine's IP, if your provider supports that, so no credentials are needed
  • run a local forward proxy that manages auth, and point WinHTTP at it

If all you have is a single proxy URL like http://USER:PASS@HOST:PORT, split it yourself: HOST:PORT goes to SetProxy, and USER/PASS go to SetCredentials.
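A sketch of the SetCredentials route, assuming you've already split the URL into a HOST:PORT string plus a username/password pair (the function name and error number are mine):

```vba
Private Function HttpGetViaAuthProxy(ByVal url As String, _
                                     ByVal proxyHostPort As String, _
                                     ByVal user As String, _
                                     ByVal pass As String) As String
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")

    ' SetProxy / SetCredentials must come after Open and before Send
    http.Open "GET", url, False
    http.SetProxy 2, proxyHostPort    ' 2 = HTTPREQUEST_PROXYSETTING_PROXY
    http.SetCredentials user, pass, 1 ' 1 = HTTPREQUEST_SETCREDENTIALS_FOR_PROXY

    http.Send

    If http.Status < 200 Or http.Status >= 300 Then
        Err.Raise vbObjectError + 104, "HttpGetViaAuthProxy", _
                  "HTTP " & http.Status & " via auth proxy for " & url
    End If

    HttpGetViaAuthProxy = http.ResponseText
End Function
```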


Write results into Excel (clean pattern)

Create a sheet called Data with headers in row 1.

Private Sub WriteRow(ByVal rowIndex As Long, ByVal title As String, ByVal url As String, ByVal price As String)
    With ThisWorkbook.Worksheets("Data")
        .Cells(rowIndex, 1).Value = title
        .Cells(rowIndex, 2).Value = url
        .Cells(rowIndex, 3).Value = price
    End With
End Sub
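Hard-coding row numbers gets old fast. A common companion pattern finds the next empty row in column A so each run appends instead of overwriting (helper name is mine):

```vba
' Return the first empty row in column A of the Data sheet.
Private Function NextFreeRow() As Long
    With ThisWorkbook.Worksheets("Data")
        NextFreeRow = .Cells(.Rows.Count, 1).End(xlUp).Row + 1
    End With
End Function
```

Then call WriteRow NextFreeRow(), title, url, price inside your scrape loop.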

Then scrape:

Public Sub ScrapeExampleToSheet()
    Dim html As String
    Dim doc As Object
    Dim title As String

    html = HttpGetWithRetry("https://example.com/products", 5)
    Set doc = HtmlDoc(html)

    title = GetInnerTextBySelector(doc, "h1", "product-title")

    WriteRow 2, title, "https://example.com/products", ""
End Sub

Common failure modes (and fixes)

1) You’re scraping a JS-rendered site

If the data isn’t in the HTML source, VBA can’t “see” it.

Fix:

  • scrape a JSON endpoint instead
  • or switch to Python + Playwright

2) Your macro hangs

Fix:

  • always set timeouts
  • always implement retries

3) You’re getting blocked

Fix:

  • slow down
  • rotate IPs (ProxiesAPI)
  • keep sessions/cookies
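Slowing down costs one helper once Sleep is declared (see Step 3). A sketch that waits a randomized 2-5 seconds between requests, so your traffic looks less like a burst:

```vba
' Randomized politeness delay: 2000-5000 ms between requests.
Private Sub PoliteDelay()
    Randomize
    Sleep 2000 + CLng(Rnd * 3000)
End Sub
```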

Comparison: VBA vs Python for scraping

Capability          | VBA (Excel)   | Python
--------------------|---------------|-------------------------------
Requests            | OK (WinHTTP)  | Excellent (requests/httpx)
HTML parsing        | OK (MSHTML)   | Excellent (BeautifulSoup/lxml)
Concurrency         | Painful       | Straightforward
JS rendering        | Basically no  | Playwright/Selenium
Maintainability     | Low           | High
Deploy as a service | No            | Yes

If this is a one-off internal report, VBA is fine.

If you want a durable pipeline, Python wins.


Where ProxiesAPI fits (honestly)

VBA doesn’t give you many levers:

  • you can’t easily randomize TLS fingerprints
  • you can’t easily do smart concurrency
  • error handling is primitive

So when a site blocks you, the fastest improvement is often: route your requests through a reliable proxy.

That’s where ProxiesAPI helps — not as magic, but as a stable proxy layer while you keep the rest of your macro simple.


Related guides

Node.js Web Scraping with Cheerio: Quick Start Guide
A practical Cheerio + HTTP quick start: fetch with retries, parse real HTML selectors, paginate, and scale reliably with ProxiesAPI.

Web Scraping with Rust: reqwest + scraper Crate Tutorial (2026)
A practical Rust scraping guide: fetch pages with reqwest, rotate proxies, parse HTML with the scraper crate, handle retries/timeouts, and export structured data.

How to Download Images from URLs with Python (fast, reliable, and deduped)
A production-grade image downloader in Python: concurrency, retries, content-type validation, safe filenames, and checksum dedupe. Optional ProxiesAPI proxy support for rate-limited hosts.

How to Scrape Data Without Getting Blocked: A Practical Playbook
A no-fluff anti-blocking guide: rate limits, fingerprints, retries/backoff, header hygiene, caching, and when proxy rotation (ProxiesAPI) is the simplest fix. Includes comparison tables and checklists.