How to get the dynamic content using selenium via python after page load

Time:04-06

I'm trying to grab content that loads dynamically after page load, using Selenium with Python 3. I tried every solution I could find here, but none of them worked.

Specifically, what I need is the value of href, but for now just being able to retrieve the entire page source with all the content present after page load would work as well.

Example of href value I need:

<a  href="/path1/path2/path3/lsdkfughjfsldkfghsdlf">

I tried the following:

WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".class1.class2")))

Which errors with this:

WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".Image-image")))
    File "C:\Users\Mike\PycharmProjects\pythonProject\venv\lib\site-packages\selenium\webdriver\support\wait.py", line 89, in until
    raise TimeoutException(message, screen, stacktrace)

If I remove that and try this:

page_source = driver.page_source
driver.close()
with open(r"output.txt", "w") as f:
    f.write(page_source)

Then I just get the initially loaded HTML page, without the dynamic content.

Additional configurations I am using that may be helpful in finding a solution:

s = Service("chromedriver.exe")
options = Options()
options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) ''Chrome/94.0.4606.81 Safari/537.36')
driver = webdriver.Chrome(options=options,service=s)

Any direction would be greatly appreciated!

CodePudding user response:

As per the HTML:

<a  href="/path1/path2/path3/lsdkfughjfsldkfghsdlf">

Your locator strategy looks technically correct:

WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".class1.class2")))

However, the values of the class attribute alone may not identify the element uniquely within the HTML DOM. In such cases you may need to construct a more specific locator by adding the <tag_name> as well as a partial static value of the href attribute, as follows:

print(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.class1.class2[href]"))).get_attribute("href"))

Or, more specifically:

print(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.class1.class2[href*='path']"))).get_attribute("href"))
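
Once the wait succeeds and the dynamic content is in the DOM, the href values could also be pulled out of driver.page_source. The following is only a minimal sketch using the standard library; the names HrefCollector and extract_hrefs are hypothetical, the sample HTML is made up for illustration, and class1/class2 are the placeholder class names from the question.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href values of <a> tags that carry all required classes."""

    def __init__(self, required_classes):
        super().__init__()
        self.required = set(required_classes)
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # The class attribute is a space-separated list of class names.
        classes = set((attr_map.get("class") or "").split())
        href = attr_map.get("href")
        if href and self.required <= classes:
            self.hrefs.append(href)


def extract_hrefs(page_source, required_classes):
    """Return every matching href found in an HTML string (hypothetical helper)."""
    parser = HrefCollector(required_classes)
    parser.feed(page_source)
    parser.close()
    return parser.hrefs


# Made-up sample standing in for driver.page_source after the wait.
sample = (
    '<div><a class="class1 class2" href="/path1/path2/path3/abc">x</a>'
    '<a class="class1" href="/other">y</a></div>'
)
print(extract_hrefs(sample, ["class1", "class2"]))
```

In a real run you would pass driver.page_source instead of sample, and only after the explicit wait has returned, since the page source captured earlier will not contain the dynamic elements.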

CodePudding user response:

To address the following error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"a.Image-image"}

Please check in the Chrome dev tools whether the element has a unique entry in the HTML DOM.

XPath that you should check:

//a[@class='Image-image']

Steps to check:

Press F12 in Chrome -> go to the Elements tab -> press Ctrl+F -> paste the XPath and check whether your desired element is highlighted as the 1/1 matching node.

If //a[@class='Image-image'] is unique, then you need to check the conditions below as well.

  1. Check if it's in any iframe/frame/frameset.

    Solution: switch to iframe/frame/frameset first and then interact with this web element.

  2. Check if it's in any shadow-root.

    Solution: Use driver.execute_script('return document.querySelector ...') to get a web element back, then operate on it accordingly.

  3. Make sure that the element is rendered properly before interacting with it. Add a hardcoded delay or an explicit wait and try again.

    Solution: time.sleep(5) or

    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[@class='Image-image']"))).get_attribute("href")

  4. Check whether you have been redirected to a new tab or window. If you have not switched to that new tab or window, you will likely get a NoSuchElementException.

    Solution: switch to the relevant window/tab first.

  5. If you have switched to an iframe and the desired element is not in that iframe's context, first switch back to the default content and then interact with it.

    Solution: switch to default content and then switch to respective iframe.
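
Checks 1 and 5 above can be sketched as a small helper that looks in the top-level document first and then in each top-level iframe, always switching back to the default content afterwards. This is an illustration under assumptions, not a definitive implementation: find_href_in_any_frame is a hypothetical name, nested iframes are not searched, and the locator strategies are passed as the plain strings "css selector" and "tag name", which are the values of By.CSS_SELECTOR and By.TAG_NAME.

```python
def find_href_in_any_frame(driver, css_selector):
    """Return the href of the first match in the top document or in a
    top-level iframe, or None if nothing matches (hypothetical helper)."""
    css, tag = "css selector", "tag name"  # values of By.CSS_SELECTOR / By.TAG_NAME

    # Always start from the top-level document (check 5).
    driver.switch_to.default_content()
    matches = driver.find_elements(css, css_selector)
    if matches:
        return matches[0].get_attribute("href")

    frame_count = len(driver.find_elements(tag, "iframe"))
    for index in range(frame_count):
        # Re-locate the iframes from the top document on every pass,
        # because frame references are only usable from that context.
        driver.switch_to.default_content()
        frames = driver.find_elements(tag, "iframe")
        if index >= len(frames):
            break
        driver.switch_to.frame(frames[index])  # check 1
        matches = driver.find_elements(css, css_selector)
        if matches:
            # Read the value before leaving the iframe context.
            href = matches[0].get_attribute("href")
            driver.switch_to.default_content()
            return href

    driver.switch_to.default_content()
    return None
```

A call such as find_href_in_any_frame(driver, "a.Image-image") would typically follow an explicit wait, since find_elements returns an empty list rather than waiting for content that has not rendered yet.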
