I have a web scraper for work that downloads all PDFs matching given filters from our non-public website. I'm trying to name the files "ID Number File name date file was made.pdf". I used the absolute XPath for the file name inside a try statement, but it's not working and execution jumps to the exception. I would appreciate a second set of eyes to see if I have missed anything syntax-wise, or if there's a better way to implement this. I have also copied the XPath to see if anyone with more experience can give me the relative XPath to use.
Error:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/div[1]/div[1]/div[4]/div/div[2]/div/div/div[2]/div/div/div[1]/div[1]/div[2]/h2"}
(Session info: chrome=93.0.4577.63)
HTML:
My code:
table_rows = driver.find_elements_by_xpath("//a[contains(@href, '#resources/details/?id=')]")
for link_elem in table_rows:
    url = link_elem.get_attribute('href')
    id_number = url[-8:]  # the last 8 characters of the link are the ID
    driver.get(url)
    try:
        filename_first = driver.find_element_by_xpath('/html/body/div[1]/div[1]/div[4]/div/div[2]/div/div/div[2]/div/div/div[1]/div[1]/div[2]/h2').text.replace(':', '').replace(r'/', '-')
    except:
        # fall back to a generic name when the title cannot be located
        filename_first = 'file.pdf'
    #filename_first = driver.find_element_by_xpath('/html/body/div[1]/div[1]/div[4]/div/div[2]/div/div/div[2]/div/div/div[1]/div[1]/div[2]/h2').text.replace(':', '').replace(r'/', '-')
    filename_final = id_number + ' ' + filename_first  # + '.pdf'
    css_thing = '#file > div:nth-child(1) > div.form-group.padding-xs-bottom > div > div > button.btn.btn-danger.get-download-url'
    time.sleep(5)
    download_button = driver.find_element_by_css_selector(css_thing)
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, css_thing))).click()
    time.sleep(5)
    link_data = driver.find_element_by_xpath("//a[contains(@href, 'https://s3.amazonaws.com')]")
    url = link_data.get_attribute("href")
    r = requests.get(url, allow_redirects=True)
    # write the downloaded bytes and close the file afterwards
    with open(filename_final, 'wb') as f:
        f.write(r.content)
    print("good")
CodePudding user response:
You are getting the NoSuchElementException because you are using an absolute XPath. If the DOM is dynamic, an absolute path is likely to stop matching and your script will fail.
Always use a reliable relative XPath instead.
xPath: //a//div[@class='flex-1 ellipsis padding-xs-right']
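To illustrate why a positional path is brittle, here is a small sketch using Python's standard-library xml.etree.ElementTree on a made-up page fragment. The markup, the build_page and lookup helpers, and the sample title are all hypothetical and not taken from the real site. ElementTree supports only a subset of XPath and does not accept paths starting with "/", so the positional path is written relative to the root html element.

```python
import xml.etree.ElementTree as ET


def build_page(with_banner):
    """Return a tiny, made-up page; optionally with an extra div on top."""
    banner = "<div class='banner'>Notice</div>" if with_banner else ""
    markup = (
        "<html><body>"
        + banner
        + "<div class='content'>"
        "<a href='#resources/details/?id=12345678'>"
        "<div class='flex-1 ellipsis padding-xs-right'>Report: Q3/Q4</div>"
        "</a></div></body></html>"
    )
    return ET.fromstring(markup)


# Positional path: depends on the exact order of the divs under body.
POSITIONAL = "body/div[1]/a/div"
# Relative path: anchored on the tag and class, wherever it sits in the tree.
RELATIVE = ".//a//div[@class='flex-1 ellipsis padding-xs-right']"


def lookup(root, path):
    """Return the text of the first match, or None if nothing matches."""
    node = root.find(path)
    return None if node is None else node.text


stable = build_page(with_banner=False)
changed = build_page(with_banner=True)

print(lookup(stable, POSITIONAL))   # found while the layout is unchanged
print(lookup(changed, POSITIONAL))  # the extra div shifts the positions
print(lookup(changed, RELATIVE))    # still found after the layout change
```

In this sketch a single extra div is enough to make the positional path miss, while the class-based path keeps working.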
CodePudding user response:
Based on the snapshot that you've shared, I believe you can use the XPath below:
//a[contains(@href,'#resources/details/')]//div[contains(@class,'ellipsis')]
Before using this XPath, check in DevTools that it has exactly one matching node (1/1).
Use it like this:
filename_first = driver.find_element_by_xpath("//a[contains(@href,'#resources/details/')]//div[contains(@class,'ellipsis')]").text
print(filename_first)
Once you get the desired output with the above code, we can clean it up with a regex to get what you are actually looking for.
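A minimal sketch of what that regex clean-up might look like, assuming the goal is the "ID Number File name.pdf" pattern from the question. The build_filename helper, the sample ID and title, and the exact set of characters being replaced are assumptions for illustration only.

```python
import re


def build_filename(id_number, title, default="file"):
    """Combine an ID and a scraped title into a safe PDF file name."""
    # Drop colons, as the question's code does.
    cleaned = title.replace(":", "")
    # Replace other characters that are not allowed in file names with "-".
    cleaned = re.sub(r'[\\/*?"<>|]', "-", cleaned)
    # Collapse runs of whitespace and trim the ends.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    if not cleaned:
        cleaned = default
    # Add the extension only when the title does not already carry it.
    if not cleaned.lower().endswith(".pdf"):
        cleaned += ".pdf"
    return f"{id_number} {cleaned}"


print(build_filename("12345678", "Report: Q3/Q4"))
```

The result can be passed straight to open(..., 'wb') in place of filename_final.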