I am currently building a macro with VBA in order to automatically obtain te magnetic declination of a certain point on Earth. I managed to automate the process of filling in the right coordinates and make sure the output is in the right format. First, I wanted to have the output as a CSV-file. The process of automatically downloading that file and then subtracting the correct number was working. However, I know want to use the HTML option as output, since that avoids downloading files and thus makes the process (in my opinion) much faster. The problem I now have, is related to subtracting the magnetic variation from the HTML pop-up. When I am looking into the source code of the webpage, it is unclear to me how to properly obtain the right number, since this number doesn't have an ID and the table it is displayed in has the same class name as the input table...
I think I need to, somehow, loop over the elements of the output table and in that way obtain the magnetic variation.
The magnetic variation is calculated via: https://www.ngdc.noaa.gov/geomag/calculators/magcalc.shtml#declination
The number I am trying to obtain has the following position in the source code:
The piece of VBA code I use to automatically generate the output:
Sub ScrapeWebData()
Dim ieObj As InternetExplorer
Dim htmlEle As IHTMLElement
Dim i As Integer
i = 2
Set IE = CreateObject("InternetExplorer.Application")
'Set ieObj = New InternetExplorer
IE.Visible = True
IE.navigate "https://www.ngdc.noaa.gov/geomag/calculators/magcalc.shtml"
Application.Wait Now TimeValue("00:00:05")
Set doc = IE.document
'Import right latitude and longitude into the Magnetic Declination Calculator
'Fill in the latitude
doc.getElementById("declinationLat1").Value = ThisWorkbook.Sheets("Sheet1").Range("A1").Value
'Choose North or South
If ThisWorkbook.Sheets("Sheet1").Range("B1").Value = "S" Then
doc.all.Item("lat1Hemisphere")(0).Checked = True
End If
If ThisWorkbook.Sheets("Sheet1").Range("B1").Value = "N" Then
doc.all.Item("lat1Hemisphere")(1).Checked = True
End If
'Fill in the longitude
doc.getElementById("declinationLon1").Value = ThisWorkbook.Sheets("Sheet1").Range("C1").Value
'Choose West or East
If ThisWorkbook.Sheets("Sheet1").Range("D1").Value = "W" Then
doc.all.Item("lon1Hemisphere")(0).Checked = True
End If
If ThisWorkbook.Sheets("Sheet1").Range("D1").Value = "E" Then
doc.all.Item("lon1Hemisphere")(1).Checked = True
End If
'Choose format of the output file to be HTML
doc.all.Item("resultFormat")(0).Checked = True
doc.getElementById("calcbutton").Click
'Here I tried to obtain the right number from the output, but it doesn't work sadly.
For Each htmlEle In IE.document.getElementById("declinationResultContents")
For Each htmlnew In IE.document.getElementsByClassName("shadow")(4).getElementsByTagName("tr")
With ActiveSheet
.Range("A" & i).Value = htmlnew.Children(0).textContent
End With
i = i 1
Next htmlnew
Next htmlEle
End Sub
Here, I use the following cells in my excel sheet: enter image description here
Is someone familiar with web scraping of HTML-tables where the desired number (and table) doesn't have a clear identifier? Could you please help me? Thanks in advance!!
CodePudding user response:
Actually I didn't want to give any more answers about web scraping with IE. But since this page still seems to run flawlessly, I took a look at how the popup works.
But first a hint:
The first line in every VBA module should be Option Explicit
. This will first check if all variables are declared when starting a macro. This is sometimes annoying, but can avoid very time-consuming error searches just because there is a typo.
There is more to optimize in your macro, but all in all it works. I also looked at the whole thing because you commented the code in such a way that I understood directly what you wanted to do. That is very pleasant.
Delete the following code block. It does not work like this:
'Here I tried to obtain the right number from the output, but it doesn't work sadly.
For Each htmlEle In IE.document.getElementById("declinationResultContents")
For Each htmlnew In IE.document.getElementsByClassName("shadow")(4).getElementsByTagName("tr")
With ActiveSheet
.Range("A" & i).Value = htmlnew.Children(0).textContent
End With
i = i 1
Next htmlnew
Next htmlEle
Replace it with this code. You should be able to use it to read the data you really need:
'The needed data stands in the last html table of the document
'This table will build when clicking the button
'So we have to wait until the browser has generated the code
'(It would be better to use a loop here and not Application.Wait
'But I use the quick and dirty way here ;-)
Application.Wait Now TimeValue("00:00:05")
Set htmlEle = IE.document.getElementsByTagName("table")
MsgBox htmlEle(htmlEle.Length - 1).innertext
Edit
Here is the whole macro like it works for me. I started to implement the loop, so that Application.Wait
does not have to be used. But that leads to a follow up problem on the page. This can probably also be solved, but the effort is relatively large. So I deleted it to keep the code clear.
Sub ScrapeWebData()
Const url As String = "https://www.ngdc.noaa.gov/geomag/calculators/magcalc.shtml"
Dim IE As Object
Dim doc As Object
Dim htmlEle As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.navigate url
Application.Wait Now TimeValue("00:00:10")
Set doc = IE.document
'Import right latitude and longitude into the Magnetic Declination Calculator
'Fill in the latitude
doc.getElementById("declinationLat1").Value = ThisWorkbook.Sheets("Sheet1").Range("A1").Value
'Choose North or South
If ThisWorkbook.Sheets("Sheet1").Range("B1").Value = "S" Then
doc.all.Item("lat1Hemisphere")(0).Checked = True
End If
If ThisWorkbook.Sheets("Sheet1").Range("B1").Value = "N" Then
doc.all.Item("lat1Hemisphere")(1).Checked = True
End If
'Fill in the longitude
doc.getElementById("declinationLon1").Value = ThisWorkbook.Sheets("Sheet1").Range("C1").Value
'Choose West or East
If ThisWorkbook.Sheets("Sheet1").Range("D1").Value = "W" Then
doc.all.Item("lon1Hemisphere")(0).Checked = True
End If
If ThisWorkbook.Sheets("Sheet1").Range("D1").Value = "E" Then
doc.all.Item("lon1Hemisphere")(1).Checked = True
End If
'Choose format of the output file to be HTML
doc.all.Item("resultFormat")(0).Checked = True
doc.getElementById("calcbutton").Click
'The needed data stands in the last html table of the document
'This table will build when clicking the button
'So we have to wait until the browser has generated the code
Application.Wait Now TimeValue("00:00:05")
Set htmlEle = IE.document.getElementsByTagName("table")
'You can write the following result were ever you want
'Here it's a message box, but you can use also a cell
'or Debug.Print for testing senses
MsgBox htmlEle(htmlEle.Length - 1).getElementsByTagName("tbody")(0).getElementsByTagName("td")(1).innertext
End Sub