How to select the html attribute inside a selenium object

Time:02-26

I am learning web scraping with Selenium and have run into an issue when trying to select an element inside a Selenium object. If I just print elems.text inside the loop, I get the broader data (the whole paragraph for each listing). However, when I use an XPath to access the h2 title tag of each listing inside this broader element, only the first listing's title is appended to the titles list, whereas I want all of them. I checked the XPath, and it is the same for each listing. How can I get all of the listings instead of just the first one?

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumed setup; not shown in the original snippet
titles = []
driver.get("https://www.sellmytimesharenow.com/timeshare/All Timeshare/vacation/buy-timeshare/")

results = driver.find_elements(By.CLASS_NAME, "results-list")

for elems in results:
    print(elems.text)  # this prints out the full description paragraphs
    elem_title = elems.find_element(By.XPATH, '//*[@id="search-page"]/div[3]/div/div/div[2]/div/div[2]/div/a[1]/div/div[1]/div/h2')
    titles.append(elem_title.text)

CodePudding user response:

If you aren't limited to accessing the elements by XPATH only, then here is my solution:

results = driver.find_elements(By.CLASS_NAME, "result-box")
for elems in results:
    titles.append(elems.text.split("\n")[0])

To get the listings, you call find_elements(By.CLASS_NAME, "results-list"), but the page contains only one element with the class name "results-list". That single div holds every listing, so "results" has one item, the loop runs once, and elems.text is the text of all listings joined into one long string. That is why you end up with only one heading.

However, there are multiple elements with the class name "result-box", one per listing, so find_elements stores each as its own item in "results". Because the title of each listing is on the first line of its text, you can split each element's text on the newline character and take the first piece.
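
For reference, the approach above could be wrapped up roughly as follows. This is only a sketch: the function names (titles_from_text, titles_from_h2, scrape_titles) are made up for illustration, and it skips blank lines before taking the first one, which the one-liner above does not do. It also shows an XPath alternative, on the assumption that each "result-box" contains an h2 as the XPath in the question suggests. In Selenium, an XPath starting with "//" is evaluated against the whole document even when called on an element, so it keeps returning the first match on the page; prefixing it with "." limits the search to that element.

```python
def titles_from_text(listings):
    """Take the first non-empty line of each listing's text as its title."""
    titles = []
    for listing in listings:
        lines = [line.strip() for line in listing.text.split("\n") if line.strip()]
        if lines:  # skip listings that have no visible text
            titles.append(lines[0])
    return titles


def titles_from_h2(listings):
    """Alternative: read the <h2> inside each listing via a relative XPath.

    Assumes every listing element contains an <h2>; find_element raises
    NoSuchElementException in real Selenium if it does not.
    """
    # The leading "." scopes the search to this listing. A bare "//h2" would
    # search the whole page and return the same first match every time.
    # "xpath" is the string value of Selenium's By.XPATH constant.
    return [listing.find_element("xpath", ".//h2").text for listing in listings]


def scrape_titles(url):
    """Hypothetical driver code; needs selenium and a local Chrome install."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # One "result-box" element per listing, as described above.
        listings = driver.find_elements(By.CLASS_NAME, "result-box")
        return titles_from_text(listings)
    finally:
        driver.quit()
```

Calling scrape_titles with the URL from the question should return one title per listing, provided the page still uses the "result-box" class. The two helpers only rely on .text and find_element, so they can be tried on any list of Selenium elements.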
