Home > Software engineering >  How can we use http.Open "GET" to list items from a table in HTML?
How can we use http.Open "GET" to list items from a table in HTML?

Time:08-08

I'm testing an idea that I had. It seems like I should be able to scrape out various HTML elements from a table in a website, but my code can't seem to find the table, which definitely seems to be there.

Sub TryThis()

Dim oHtml As HTMLDocument
Dim oElement As Object

Set oHtml = New HTMLDocument

With CreateObject("WINHTTP.WinHTTPRequest.5.1")
    .Open "GET", "https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population", False
    .send
    oHtml.body.innerHTML = .responseText
End With

Set myitem = oHtml.getElementsByClassName("wikitable sortable jquery-tablesorter")

i = 0
For Each oElement In myitem
    Sheets("Sheet1").Range("A" & i   1) = myitem(i).innerText
    i = i   1
Next oElement

End Sub

Essentially, I would like to loop through HTML items, print out, in cells, what is in the table named 'wikitable sortable jquery-tablesorter' Here is a screen shot that may help.

enter image description here

CodePudding user response:

You were really close, I think the issue is the jquery-tablesorter class is being added by jQuery (or plugin) after the page is loaded via JS. So that class isn't present in the DOM when the content is pulled in by the web request, it's added after. So removing it from the search criteria, should fix the issue.

Here's what I came up to address this, and also move the table contents over a bit quicker. I just did the first instance of wikitable sortable classes, but should be possible to loop each table too.

Sub TryThis()

    Dim oHtml As HTMLDocument
    Dim oElement As Object
    Dim htmlText As String
    
    Set oHtml = New HTMLDocument
    
    With CreateObject("WINHTTP.WinHTTPRequest.5.1")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population", False
        .send
        oHtml.body.innerHTML = .responseText
    End With
    
    htmlText = oHtml.getElementsByClassName("wikitable sortable")(0).outerhtml
    
    With CreateObject("new:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}") 'Clipboard
        .SetText htmlText
        .PutInClipboard
        Sheets(1).Range("A1").Select
        Sheets(1).PasteSpecial Format:="Unicode Text"
    End With
    
End Sub
  • Related