Home > Mobile >  looping over unordered list without id in vba selenium
looping over unordered list without id in vba selenium

Time:10-23

I am trying to get the title and the price and print it into the excel sheet from an unordered html list using vba selenium.

I have tried various means and most of the tutorials have a nice clean html example with an id tag.

here is the vba code I have. It doesn't return an error, but it doesn't pull in anything.

Sub test2()
Dim driver As New WebDriver
Dim rowc, cc, columnC As Integer
rowc = 2
Application.ScreenUpdating = False
driver.Start "chrome"
driver.Get "https://www.fence-depot.com/fence/36-inch-auburn-residential-aluminum-fence/"
For Each li In driver.FindElementByClass("wys-right").FindElementByTag("h4").FindElementsByTag("ul")
cc = 1
For Each t In li.FindElementsByTag("li")
Sheet1.Cells(1, cc).Value = t.Text
cc = cc   1
Next t
Next li

Application.Wait Now   TimeValue("00:00:20")
End Sub

this is the html code I am trying to get to

<div class="wys-right" style="float: left; width: 350px; padding-bottom: 10px; min-height: 520px;">
<h4>Pricing:</h4>
<ul>
<li><span style="color: #808080;">6' Wide Panel =</span>&nbsp;<strong><span style="color: #000000;">$65.01 (2 Rail)&nbsp;</span></strong></li>
<li><span style="color: #888888;">6' Wide Panel = <strong><span style="color: #000000;">$74.91 (3 Rail)</span></strong></span></li>
<li><span style="color: #808080;">2" x 2" x .062" Lines, Ends, and Corner <span style="text-decoration: underline; color: #ff6600;"><span style="text-decoration: underline;"><a title="Punched Posts for an Aluminum Fence" href="https://fence-depot.com/punched-posts-for-an-aluminum-fence/?blog=cat" target="_blank" rel="noopener noreferrer"><span style="color: #ff6600; text-decoration: underline;">Posts</span></a> </span></span>=&nbsp;</span><span style="color: #000000;"><strong>$21.05</strong></span></li>
<li><span style="color: #808080;">2.5" x 2.5" Upgraded Lines, Ends, Corner Posts =</span> <strong><span style="color: #000000;">$34.05</span></strong></li>
<li><span style="color: #808080;">2" x 2" x .125" Gate Posts =</span> <strong><span style="color: #000000;">$36.53</span></strong></li>
<li><span style="color: #000000;"><span style="color: #808080;">3" x 3" x .125" Upgraded Gate Posts =</span></span><strong><span style="color: #000000;"> $65.62</span></strong></li>
<li><span style="color: #000000;"><span style="color: #808080;">4" x 4" x .125" Upgraded Gate Posts = </span></span><strong><span style="color: #000000;">$76.15</span></strong></li>
<li><span style="color: #000000;"><span style="color: #808080;">6" x 6" x .125" Upgraded Gate Posts =</span></span><strong><span style="color: #000000;"> $122.58&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;</span></strong></li>
</ul>

CodePudding user response:

I made some changes that might be what you want. If it is not, you should be able to decipher it.

Mostly your trouble was that UL is not a child of H4, so it was never going to find it. Also, I don't know that the object model identifies a UL element as a collection of LI elements. I think you have to explicitly find the LI tags. Could be wrong.

Sub test2()
Dim driver As New webdriver
Dim rowc, cc, columnC As Integer
    
    rowc = 2
    Application.ScreenUpdating = False
    
    driver.Start "chrome"
    driver.Get "https://www.fence-depot.com/fence/36-inch-auburn-residential-aluminum-fence/"
    
    r = 0
    For Each ul In driver.FindElementByClass("wys-right").FindElementsByTag("ul")
        r = r   1: cc = 0
        For Each li In ul.FindElementsByTag("li")
            cc = cc   1
            Sheet1.Cells(r, cc).Value = li.Text
        Next li
    Next ul
    
    Application.ScreenUpdating = True
'    Application.Wait Now   TimeValue("00:00:20")
End Sub

CodePudding user response:

You can use xmlhttp rather than overhead of browser. You need to pass in a recognised user-agent header.

You can then target the appropriate li elements by gathering those within the first ul element with a parent with class wys-right. If you split the li text on " = " then you can write out price and description into separate columns.

Your mileage may vary depending on Office version. This is written using 2019.

You could similar logic, with Selenium, but pass the css selector list .wys-right ul:nth-of-type(1) li to FindElementsByCss() and then For Each the returned WebElements collection and split the .text of each WebElement.

Option Explicit

Public Sub WriteOutPrices()
    'tools > references > Microsoft HTML Object Library
    Dim html As MSHTML.HTMLDocument, xhr As Object
    
    Set xhr = CreateObject("MSXML2.XMLHTTP")
    Set html = New MSHTML.HTMLDocument

    With xhr
        .Open "GET", "https://www.fence-depot.com/fence/36-inch-auburn-residential-aluminum-fence/", False
        .setRequestHeader "User-Agent", "Safari/537.36"
        .send
        html.body.innerHTML = .responseText
        Debug.Print html.Title
    End With
    
    Dim ws As Worksheet, i As Long
    
    Set ws = ThisWorkbook.Worksheets(1)
    
    ws.Cells(1, 1).value = html.querySelector("h1").innerText

    With html.querySelectorAll(".wys-right ul:nth-of-type(1) li")
        For i = 0 To .length - 1
            Dim arr() As String
            arr = Split(.Item(i).innerText, " = ")
            ws.Cells(i   3, 1) = arr(0)
            ws.Cells(i   3, 2) = arr(1)
        Next
    End With
End Sub
  • Related