I'm trying to grab content that loads dynamically after page load, using Selenium with Python 3. I tried every solution I could find here, but none of them work.
Specifically, what I need is the value of an href, but for now just being able to retrieve the entire page source, including the content added after page load, would work as well.
Example of the href value I need:
<a href="/path1/path2/path3/lsdkfughjfsldkfghsdlf">
I tried the following:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".class1.class2")))
This fails with the following error:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".Image-image")))
File "C:\Users\Mike\PycharmProjects\pythonProject\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 89, in until
raise TimeoutException(message, screen, stacktrace)
If I remove that and try this:
page_source = driver.page_source
driver.close()
with open(r"output.txt", "w") as f:
    f.write(page_source)
Then I only get the HTML as it was initially loaded, without the dynamic content.
Additional configuration I'm using that may help in finding a solution:
s = Service("chromedriver.exe")
options = Options()
options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36')
driver = webdriver.Chrome(options=options, service=s)
Any direction would be greatly appreciated!
CodePudding user response:
As per the HTML:
<a href="/path1/path2/path3/lsdkfughjfsldkfghsdlf">
Your locator strategy looks technically correct:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".class1.class2")))
However, the class values alone may not identify the element uniquely within the HTML DOM. In such cases you may need a more specific locator that adds the tag name and requires the href attribute to be present:
print(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.class1.class2[href]"))).get_attribute("href"))
Or, more specifically, by also matching a static part of the href value:
print(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.class1.class2[href*='path']"))).get_attribute("href"))
CodePudding user response:
To address the following error:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"a.Image-image"}
Check in the Google Chrome dev tools whether the locator matches a unique entry in the HTML DOM.
XPath to check:
//a[@class='Image-image']
Steps to check:
Press F12 in Chrome -> go to the Elements tab -> press Ctrl+F -> paste the XPath and see whether your desired element is highlighted with 1/1 matching node.
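The same uniqueness check can be approximated offline against the page source saved to output.txt in the question. Below is a minimal sketch using only the standard library. AnchorCollector and find_anchor_hrefs are hypothetical names, and like the XPath above the sketch compares the class attribute exactly. It only sees what was in the saved snapshot, so content inside iframes or shadow roots will generally not show up.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect href values of <a> tags whose class attribute equals a
    given string exactly, mirroring //a[@class='...']."""

    def __init__(self, class_value):
        super().__init__()
        self.class_value = class_value
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("class") == self.class_value:
            self.hrefs.append(attr_map.get("href"))


def find_anchor_hrefs(html, class_value):
    """Return the hrefs of all matching anchors found in `html`."""
    parser = AnchorCollector(class_value)
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Usage, assuming output.txt was written as in the question:
# with open("output.txt") as f:
#     hrefs = find_anchor_hrefs(f.read(), "Image-image")
# print(len(hrefs), hrefs)
```

An empty list means the anchor was not in the DOM when the source was captured. More than one entry means the locator is not unique.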
If //a[@class='Image-image'] is unique, then also check the following conditions:
Check whether it is inside an iframe/frame/frameset. Solution: switch to the iframe/frame/frameset first and then interact with the web element.
Check whether it is inside a shadow-root. Solution: use
driver.execute_script('return document.querySelector
to get the web element returned, then operate on it accordingly.
Make sure that the element is rendered properly before interacting with it. Add a hardcoded delay or an explicit wait and try again. Solution:
time.sleep(5)
or
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[@class='Image-image']"))).get_attribute("href")
If you have been redirected to a new tab or new window and have not switched to it, you will likely get a NoSuchElementException. Solution: switch to the relevant window/tab first.
If you have switched to an iframe and the desired element is not in the same iframe context, first switch to the default content and then interact with it. Solution: switch to default content and then switch to the respective iframe.
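The iframe and window checks above can be combined into one small search routine. Below is a minimal sketch, assuming only first-level frames need to be searched. find_in_frames and find_in_windows are hypothetical helper names. The string "css selector" is the value of Selenium's By.CSS_SELECTOR, so the sketch itself needs no Selenium import and works with any driver object that offers find_elements, switch_to and window_handles.

```python
# Locator for frame elements; "css selector" is the string value of
# selenium's By.CSS_SELECTOR.
FRAME_LOCATOR = ("css selector", "iframe, frame")


def find_in_frames(driver, locator):
    """Return the first element matching `locator` in the top-level
    document or in any first-level iframe/frame, else None.

    On success the driver stays switched into the frame that holds the
    element, so the returned element can be used right away.
    """
    driver.switch_to.default_content()
    matches = driver.find_elements(*locator)
    if matches:
        return matches[0]
    for frame in driver.find_elements(*FRAME_LOCATOR):
        driver.switch_to.frame(frame)
        matches = driver.find_elements(*locator)
        if matches:
            return matches[0]
        # Go back to the top-level document before trying the next frame.
        driver.switch_to.default_content()
    return None


def find_in_windows(driver, locator):
    """Try every open window/tab in turn; return the first match or None."""
    for handle in driver.window_handles:
        driver.switch_to.window(handle)
        element = find_in_frames(driver, locator)
        if element is not None:
            return element
    return None


# Usage, assuming `driver` is a running selenium WebDriver:
# from selenium.webdriver.common.by import By
# element = find_in_windows(driver, (By.XPATH, "//a[@class='Image-image']"))
# if element is not None:
#     print(element.get_attribute("href"))
```

This does not wait for anything, so it is best called after an explicit wait or delay has given the page time to render. Nested frames and shadow roots are outside its scope.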