Web Scraping with VBA: Extract Website Data into Excel (with Proxies + Retry Logic)
If you’ve ever tried web scraping in VBA, you already know the vibe:
- It works… until it doesn’t.
- One random timeout, and your macro is stuck.
- One “Access denied” page, and your HTML parser happily extracts nonsense.
Still, scraping into Excel can be useful for:
- pulling a daily price list
- collecting a small list of URLs
- building an internal report where Excel is the “source of truth”
This guide focuses on practical VBA scraping:
- making HTTP requests from Excel (WinHTTP)
- parsing HTML (MSHTML)
- handling pagination
- adding retry + backoff
- routing requests via a ProxiesAPI proxy when sites rate-limit you
VBA is fragile under blocking and timeouts. If you must scrape from Excel, ProxiesAPI gives you a stable proxy endpoint you can route WinHTTP through to reduce 403/429 pain.
Reality check: when VBA is the wrong tool
VBA scraping is fine when:
- you need < 100 pages/day
- the site is simple HTML (no heavy JS rendering)
- you don’t need concurrency
Switch to Python (or a dedicated scraper) when:
- you need to scale (> 500 pages/day)
- you hit bot protection (Cloudflare, PerimeterX, etc.)
- the data requires JavaScript rendering
You can still export from Python to Excel—which is often the cleanest compromise.
Prerequisites (Windows Excel)
This tutorial assumes:
- Excel on Windows (WinHTTP + MSHTML work best there)
- You can open the VBA editor (Alt + F11)
References to enable
In the VBA editor:
- Tools → References
- Check: Microsoft HTML Object Library
The examples below use late binding (CreateObject), so they run even without this reference; enabling it gives you early binding and IntelliSense. If you can’t enable references at all, you can still scrape via plain string parsing, but it’s more brittle.
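As a fallback without any references, a minimal marker-based extractor can be built from InStr and Mid$ alone. A sketch (ExtractBetween is an illustrative helper; the marker strings are whatever delimits your target value in the raw HTML):

Option Explicit

' Minimal fallback: pull the text between two literal markers in raw HTML.
' Brittle by design: it breaks as soon as the page layout changes.
Private Function ExtractBetween(ByVal html As String, _
        ByVal startMarker As String, ByVal endMarker As String) As String
    Dim p1 As Long, p2 As Long
    p1 = InStr(1, html, startMarker, vbTextCompare)
    If p1 = 0 Then Exit Function          ' start marker not found
    p1 = p1 + Len(startMarker)
    p2 = InStr(p1, html, endMarker, vbTextCompare)
    If p2 = 0 Then Exit Function          ' end marker not found
    ExtractBetween = Trim$(Mid$(html, p1, p2 - p1))
End Function

For example, ExtractBetween(html, "&lt;title&gt;", "&lt;/title&gt;") pulls the page title out of the raw source.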
Step 1: HTTP GET in VBA (WinHTTP)
WinHTTP is a solid choice for HTTP requests.
Option Explicit

Private Function HttpGet(ByVal url As String, Optional ByVal timeoutMs As Long = 30000) As String
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", url, False
    ' Resolve, connect, send, receive timeouts (milliseconds)
    http.SetTimeouts timeoutMs, timeoutMs, timeoutMs, timeoutMs
    http.SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123 Safari/537.36"
    http.SetRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
    http.SetRequestHeader "Accept-Language", "en-US,en;q=0.9"
    http.Send
    If http.Status < 200 Or http.Status >= 300 Then
        Err.Raise vbObjectError + 101, "HttpGet", "HTTP " & http.Status & " for " & url
    End If
    HttpGet = http.ResponseText
End Function
Step 2: Parse HTML (MSHTML)
For HTML pages, MSHTML can parse the markup and let you query DOM elements. CreateObject("htmlfile") returns an MSHTML document via late binding, so it works even without the reference enabled.
Private Function HtmlDoc(ByVal html As String) As Object
    Dim doc As Object
    Set doc = CreateObject("htmlfile")
    doc.Open
    doc.Write html
    doc.Close
    Set HtmlDoc = doc
End Function
Extracting elements
Private Function GetInnerTextBySelector(ByVal doc As Object, ByVal tagName As String, ByVal className As String) As String
    Dim el As Object
    Dim els As Object
    Set els = doc.getElementsByTagName(tagName)
    For Each el In els
        ' Pad with spaces so "price" doesn't also match "price-old"
        If InStr(1, " " & el.className & " ", " " & className & " ", vbTextCompare) > 0 Then
            GetInnerTextBySelector = Trim(el.innerText)
            Exit Function
        End If
    Next el
    GetInnerTextBySelector = ""
End Function
VBA doesn’t have great CSS selectors, so you typically:
- search by tag
- filter by className or attributes
- keep the parsing logic simple
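The same loop-and-check pattern extends to attribute filtering. A hedged sketch (GetFirstLinkByAttribute is an illustrative helper, not a standard API; note that the legacy IE DOM may return href values resolved against a base URL):

' Scan all <a> tags and return the href of the first one whose
' attribute (e.g. "data-role") matches the requested value.
Private Function GetFirstLinkByAttribute(ByVal doc As Object, _
        ByVal attrName As String, ByVal attrValue As String) As String
    Dim el As Object
    For Each el In doc.getElementsByTagName("a")
        ' The & "" guards against Null when the attribute is absent
        If StrComp(el.getAttribute(attrName) & "", attrValue, vbTextCompare) = 0 Then
            GetFirstLinkByAttribute = el.getAttribute("href") & ""
            Exit Function
        End If
    Next el
End Function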
Step 3: Add retry + exponential backoff
Retries matter more than people think. The web is flaky.
Private Function HttpGetWithRetry(ByVal url As String, Optional ByVal attempts As Integer = 5) As String
    Dim i As Integer
    Dim delayMs As Long
    delayMs = 1000
    For i = 1 To attempts
        ' On Error Resume Next avoids the "error inside error handler" trap
        ' that a GoTo label inside a loop runs into after the first failure
        On Error Resume Next
        HttpGetWithRetry = HttpGet(url, 30000)
        If Err.Number = 0 Then
            On Error GoTo 0
            Exit Function       ' success
        End If
        Err.Clear
        On Error GoTo 0
        If i = attempts Then
            Err.Raise vbObjectError + 102, "HttpGetWithRetry", "Failed after retries: " & url
        End If
        ' Backoff (1s, 2s, 4s, 8s...)
        Sleep delayMs
        delayMs = delayMs * 2
    Next i
End Function
Sleep is not built in by default. Add this at the top of a module:
#If VBA7 Then
    Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#Else
    Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If
Note that dwMilliseconds is a DWORD, so it stays Long even in 64-bit VBA7 (only pointer-sized parameters become LongPtr).
Step 4: Pagination pattern (collect multiple pages)
Most pagination boils down to:
- find the “next page” URL
- loop until maxPages
In VBA, the easiest option is often to build URLs yourself.
Example pattern: https://example.com/list?page=1, then ?page=2, and so on.
Public Sub ScrapePaginatedList()
    Dim baseUrl As String
    Dim page As Integer
    Dim html As String
    baseUrl = "https://example.com/list?page="
    For page = 1 To 5
        html = HttpGetWithRetry(baseUrl & CStr(page), 5)
        ' parse html and write rows to sheet
        Debug.Print "page", page, "len", Len(html)
    Next page
End Sub
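If the site exposes a “next page” link instead of predictable page numbers, you can discover it from the parsed document. A sketch, assuming the link carries a class named next (adjust the class name to your target site):

' Return the href of the first anchor carrying a "next" class, or "" if none.
Private Function FindNextPageUrl(ByVal doc As Object) As String
    Dim el As Object
    For Each el In doc.getElementsByTagName("a")
        If InStr(1, " " & el.className & " ", " next ", vbTextCompare) > 0 Then
            FindNextPageUrl = el.getAttribute("href") & ""
            Exit Function
        End If
    Next el
End Function

Then loop while FindNextPageUrl returns a non-empty string, with a maxPages cap as a safety net.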
ProxiesAPI: how to use a proxy from WinHTTP
When you start seeing 403/429 or timeouts, a proxy helps.
With ProxiesAPI, you typically get a proxy endpoint like:
http://USER:PASS@gateway.proxiesapi.com:PORT
In WinHTTP you can set a proxy with SetProxy.
Private Function HttpGetViaProxy(ByVal url As String, ByVal proxy As String) As String
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", url, False
    ' SetProxy expects "HOST:PORT" (no scheme, no embedded credentials);
    ' 2 = HTTPREQUEST_PROXYSETTING_PROXY
    http.SetProxy 2, proxy
    http.SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123 Safari/537.36"
    http.Send
    If http.Status < 200 Or http.Status >= 300 Then
        Err.Raise vbObjectError + 103, "HttpGetViaProxy", "HTTP " & http.Status & " via proxy for " & url
    End If
    HttpGetViaProxy = http.ResponseText
End Function
Note about credentials
If your proxy uses username/password, you can embed them in the proxy URL for many clients.
For WinHTTP, credential handling varies; the simplest approach is often:
- use a ProxiesAPI gateway that supports credentialed proxies
- or use a local proxy client that manages auth
If you have a single ProxiesAPI proxy URL like http://USER:PASS@HOST:PORT, a common workaround is to configure a local forward proxy and authenticate there.
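WinHTTP does also expose a SetCredentials method that can pass proxy credentials directly (flag 1 targets the proxy rather than the origin server). A sketch, assuming your gateway accepts basic proxy auth; the host:port and user/pass values are placeholders:

Private Function HttpGetViaAuthProxy(ByVal url As String, _
        ByVal proxyHostPort As String, _
        ByVal proxyUser As String, ByVal proxyPass As String) As String
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", url, False
    http.SetProxy 2, proxyHostPort              ' 2 = HTTPREQUEST_PROXYSETTING_PROXY
    http.SetCredentials proxyUser, proxyPass, 1 ' 1 = HTTPREQUEST_SETCREDENTIALS_FOR_PROXY
    http.Send
    If http.Status < 200 Or http.Status >= 300 Then
        Err.Raise vbObjectError + 104, "HttpGetViaAuthProxy", _
            "HTTP " & http.Status & " via proxy for " & url
    End If
    HttpGetViaAuthProxy = http.ResponseText
End Function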
Write results into Excel (clean pattern)
Create a sheet called Data with headers in row 1.
Private Sub WriteRow(ByVal rowIndex As Long, ByVal title As String, ByVal url As String, ByVal price As String)
    With ThisWorkbook.Worksheets("Data")
        .Cells(rowIndex, 1).Value = title
        .Cells(rowIndex, 2).Value = url
        .Cells(rowIndex, 3).Value = price
    End With
End Sub
Then scrape:
Public Sub ScrapeExampleToSheet()
    Dim html As String
    Dim doc As Object
    Dim title As String
    html = HttpGetWithRetry("https://example.com/products", 5)
    Set doc = HtmlDoc(html)
    title = GetInnerTextBySelector(doc, "h1", "product-title")
    WriteRow 2, title, "https://example.com/products", ""
End Sub
Common failure modes (and fixes)
1) You’re scraping a JS-rendered site
If the data isn’t in the HTML source, VBA can’t “see” it.
Fix:
- scrape a JSON endpoint instead
- or switch to Python + Playwright
2) Your macro hangs
Fix:
- always set timeouts
- always implement retries
3) You’re getting blocked
Fix:
- slow down
- rotate IPs (ProxiesAPI)
- keep sessions/cookies
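For the sessions/cookies part, one low-tech option is to capture the Set-Cookie response header and replay it on later requests. A rough sketch (real cookie handling is messier: expiry, paths, multiple cookies; HttpGetWithCookie is an illustrative helper):

' Capture a session cookie from one response and send it back on the next request.
' Simplified: keeps only the first name=value pair and ignores cookie attributes.
Private Function HttpGetWithCookie(ByVal url As String, ByRef cookie As String) As String
    Dim http As Object
    Dim setCookie As String
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", url, False
    If Len(cookie) > 0 Then http.SetRequestHeader "Cookie", cookie
    http.Send
    On Error Resume Next                  ' GetResponseHeader raises if header absent
    setCookie = http.GetResponseHeader("Set-Cookie")
    On Error GoTo 0
    If Len(setCookie) > 0 Then cookie = Split(setCookie, ";")(0)
    HttpGetWithCookie = http.ResponseText
End Function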
Comparison: VBA vs Python for scraping
| Capability | VBA (Excel) | Python |
|---|---|---|
| Requests | OK (WinHTTP) | Excellent (requests/httpx) |
| HTML parsing | OK (MSHTML) | Excellent (BeautifulSoup/lxml) |
| Concurrency | Painful | Straightforward |
| JS rendering | Basically no | Playwright/Selenium |
| Maintainability | Low | High |
| Deploy as a service | No | Yes |
If this is a one-off internal report, VBA is fine.
If you want a durable pipeline, Python wins.
Where ProxiesAPI fits (honestly)
VBA doesn’t give you many levers:
- you can’t easily randomize TLS fingerprints
- you can’t easily do smart concurrency
- error handling is primitive
So when a site blocks you, the fastest improvement is often: route your requests through a reliable proxy.
That’s where ProxiesAPI helps — not as magic, but as a stable proxy layer while you keep the rest of your macro simple.