I am trying to get the title and the price and print it into the excel sheet from an unordered html list using vba selenium.
I have tried various means and most of the tutorials have a nice clean html example with an id tag.
here is the vba code I have. It doesn't return an error, but it doesn't pull in anything.
Sub test2()
Dim driver As New WebDriver
Dim rowc, cc, columnC As Integer
rowc = 2
Application.ScreenUpdating = False
driver.Start "chrome"
driver.Get "https://www.fence-depot.com/fence/36-inch-auburn-residential-aluminum-fence/"
For Each li In driver.FindElementByClass("wys-right").FindElementByTag("h4").FindElementsByTag("ul")
cc = 1
For Each t In li.FindElementsByTag("li")
Sheet1.Cells(1, cc).Value = t.Text
cc = cc 1
Next t
Next li
Application.Wait Now TimeValue("00:00:20")
End Sub
this is the html code I am trying to get to
<div class="wys-right" style="float: left; width: 350px; padding-bottom: 10px; min-height: 520px;">
<h4>Pricing:</h4>
<ul>
<li><span style="color: #808080;">6' Wide Panel =</span> <strong><span style="color: #000000;">$65.01 (2 Rail) </span></strong></li>
<li><span style="color: #888888;">6' Wide Panel = <strong><span style="color: #000000;">$74.91 (3 Rail)</span></strong></span></li>
<li><span style="color: #808080;">2" x 2" x .062" Lines, Ends, and Corner <span style="text-decoration: underline; color: #ff6600;"><span style="text-decoration: underline;"><a title="Punched Posts for an Aluminum Fence" href="https://fence-depot.com/punched-posts-for-an-aluminum-fence/?blog=cat" target="_blank" rel="noopener noreferrer"><span style="color: #ff6600; text-decoration: underline;">Posts</span></a> </span></span>= </span><span style="color: #000000;"><strong>$21.05</strong></span></li>
<li><span style="color: #808080;">2.5" x 2.5" Upgraded Lines, Ends, Corner Posts =</span> <strong><span style="color: #000000;">$34.05</span></strong></li>
<li><span style="color: #808080;">2" x 2" x .125" Gate Posts =</span> <strong><span style="color: #000000;">$36.53</span></strong></li>
<li><span style="color: #000000;"><span style="color: #808080;">3" x 3" x .125" Upgraded Gate Posts =</span></span><strong><span style="color: #000000;"> $65.62</span></strong></li>
<li><span style="color: #000000;"><span style="color: #808080;">4" x 4" x .125" Upgraded Gate Posts = </span></span><strong><span style="color: #000000;">$76.15</span></strong></li>
<li><span style="color: #000000;"><span style="color: #808080;">6" x 6" x .125" Upgraded Gate Posts =</span></span><strong><span style="color: #000000;"> $122.58 </span></strong></li>
</ul>
CodePudding user response:
I made some changes that might be what you want. If it is not, you should be able to decipher it.
Mostly your trouble was that UL is not a child of H4, so it was never going to find it. Also, I don't know that the object model identifies a UL element as a collection of LI elements. I think you have to explicitly find the LI tags. Could be wrong.
Sub test2()
Dim driver As New webdriver
Dim rowc, cc, columnC As Integer
rowc = 2
Application.ScreenUpdating = False
driver.Start "chrome"
driver.Get "https://www.fence-depot.com/fence/36-inch-auburn-residential-aluminum-fence/"
r = 0
For Each ul In driver.FindElementByClass("wys-right").FindElementsByTag("ul")
r = r 1: cc = 0
For Each li In ul.FindElementsByTag("li")
cc = cc 1
Sheet1.Cells(r, cc).Value = li.Text
Next li
Next ul
Application.ScreenUpdating = True
' Application.Wait Now TimeValue("00:00:20")
End Sub
CodePudding user response:
You can use xmlhttp rather than overhead of browser. You need to pass in a recognised user-agent header.
You can then target the appropriate li
elements by gathering those within the first ul
element with a parent with class wys-right
. If you split the li
text on " = "
then you can write out price and description into separate columns.
Your mileage may vary depending on Office version. This is written using 2019.
You could similar logic, with Selenium, but pass the css selector list .wys-right ul:nth-of-type(1) li
to FindElementsByCss()
and then For Each
the returned WebElements
collection and split the .text
of each WebElement
.
Option Explicit
Public Sub WriteOutPrices()
'tools > references > Microsoft HTML Object Library
Dim html As MSHTML.HTMLDocument, xhr As Object
Set xhr = CreateObject("MSXML2.XMLHTTP")
Set html = New MSHTML.HTMLDocument
With xhr
.Open "GET", "https://www.fence-depot.com/fence/36-inch-auburn-residential-aluminum-fence/", False
.setRequestHeader "User-Agent", "Safari/537.36"
.send
html.body.innerHTML = .responseText
Debug.Print html.Title
End With
Dim ws As Worksheet, i As Long
Set ws = ThisWorkbook.Worksheets(1)
ws.Cells(1, 1).value = html.querySelector("h1").innerText
With html.querySelectorAll(".wys-right ul:nth-of-type(1) li")
For i = 0 To .length - 1
Dim arr() As String
arr = Split(.Item(i).innerText, " = ")
ws.Cells(i 3, 1) = arr(0)
ws.Cells(i 3, 2) = arr(1)
Next
End With
End Sub