Home > Software design >  Webscraping: How to skip getelement when element/webpage doesn't exist?
Webscraping: How to skip getelement when element/webpage doesn't exist?

Time:02-13

I am scraping data from a website (thousands of different pages from the website), however, some of the webpages I try to navigate to don't exist or don't have the element/tag I need, these cases result in a bug prompt appearing and my code stops running. Because I am running through so many pages, I would prefer for my code just to return a blank/ignore them and move on.

I tried to use an iferror then null type statement for both the Price and WatchlistCount variables below, but this hasn't worked. Can you please help?

The code below is what I'm using to navigate to and scrape the webpages. The code before this creates the URLs that I need.

  Sub CM_Watchlist2000()

'Declaration
Dim Watchlist As Workbook
Dim Inp As Worksheet
Dim Inp2 As Worksheet
Dim ie As InternetExplorer
Dim doc As HTMLDocument
Dim i As Long
Dim iStart As Long

'Initialisation of Coinmarketcap and spreadsheet
Set Watchlist = Workbooks("Watchlist.xlsm")
Set Inp = Watchlist.Worksheets("Top 2000")
Set Inp2 = Watchlist.Worksheets("Lookup")
Set ie = New InternetExplorer

iStart = 0
   
For i = iStart To 100

Inp.Activate

    Range("A2").Offset(i, 0).Select
    Selection.Copy
    
Inp2.Activate

    Range("B1").Select
     Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
        :=False, Transpose:=True


ie.Visible = True
ie.Navigate Inp2.Range("D1")

'Waits until webpage is open before running macro
Do Until ie.ReadyState = READYSTATE_COMPLETE
   DoEvents
Loop


Set doc = ie.Document
Price = doc.getElementsByClassName("priceValue ")(0).innerText
WatchlistCount = doc.getElementsByClassName("namePill")(2).innerText

Inp.Activate

    Range("B2").Offset(i, 0) = Price
    Range("C2").Offset(i, 0) = WatchlistCount
   
Next i
   
End Sub

CodePudding user response:

Scrape Error Handling (IE)

  • This could possibly be improved if you could share the preceding code.
Option Explicit

Sub Scrape()
    
    Dim ErrNum As Long
    Dim j As Long
    
    For i = ? to ?
    
        Set doc = ie.Document
        
        If Not doc Is Nothing Then
            
            On Error Resume Next
                Price = doc.getElementsByClassName("priceValue ")(0).innerText
                WatchlistCount = doc.getElementsByClassName("namePill")(2).innerText
                ErrNum = Err.Number
            On Error GoTo 0
            
            If ErrNum = 0 Then
                inp.Range("B2").Offset(j, 0) = Price
                inp.Range("C2").Offset(j, 0) = WatchlistCount
                j = j   1
            Else
                ErrNum = 0
            End If
            
            Set doc = Nothing
        
        End If
    
    Next i
     
End Sub

CodePudding user response:

First, assign the element to an object variable, and then assign the price to your variable, if getElementsByClassName(...)(0) returns an element.

Dim priceElement As Object
Set priceElement = doc.getElementsByClassName("priceValue")(0)

If Not priceElement Is Nothing Then
    Dim price As String
    price = priceElement.innerText
End If
  • Related