I'm already getting the data that I need (from a print test) but I think the webdriver still continues to look for elements so it returns an error. I've included my code below, any help would be appreciated. Thanks!
import datetime
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
url = 'https://www.finder.com.au/home-loans'
driver.get(url)
driver.execute_script("window.scrollTo(0, 2080)")
get_today = datetime.datetime.now()
today = get_today.strftime('%d/%m/%Y')
affiliate = 'Finder'
rank = 1
results = []
loans = WebDriverWait(driver, 30).until(EC.presence_of_all_elements_located((By.XPATH, '//*[@id="comparison-table-0000000000"]//*')))
for i in range(1, len(loans)):
# loan_listing = WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.XPATH, f'//*[@id="comparison-table-0000000000"]/tbody/tr[{i}]//a[1]'))).text
loan_listing = driver.find_element(By.XPATH, f'//*[@id="comparison-table-0000000000"]/tbody/tr[{i}]//a[1]').text
print(loan_listing.split(' ', 1))
It already prints the values I'm looking for:
But after a few secs also returns this error:
CodePudding user response:
The problem is:
This XPath '//*[@id="comparison-table-0000000000"]//*'
is too general, it matches more than 1500 elements on that page so that loans
list contain a lot of irrelevant elements.
After that you are printing the texts from elements matching this XPath: //*[@id="comparison-table-0000000000"]/tbody/tr[{i}]//a[1]
. There are 20 elements per page matching this. So the first 20 results are printed correctly but your loop continues since the length of loans
list is much longer.
So, to make your code better you can fix this line loans = WebDriverWait(driver, 30).until(EC.presence_of_all_elements_located((By.XPATH, '//*[@id="comparison-table-0000000000"]//*')))
to make it locate relevant elements only, as following:
loans = WebDriverWait(driver, 30).until(EC.presence_of_all_elements_located((By.XPATH, '//*[@id="comparison-table-0000000000"]/tbody/tr//a[@)))
Or even
loans = WebDriverWait(driver, 30).until(EC.presence_of_all_elements_located((By.XPATH, '//*[@id="comparison-table-0000000000"]//a[@]')))
Here I defined the a
element locator to be more precisely and now this locator matches 20 relevant elements only