StaleElementReferenceException issue while scraping with Selenium

Time:10-18

I'm trying to load this page in full: https://candidat.pole-emploi.fr/offres/emploi/horticulteur/s1m1

I've added a line of code to handle the cookies popup.

Then I've added some lines that click the "load more results" button repeatedly, so that the full HTML is loaded and can be printed.

But I get this error after the button has been clicked once:

StaleElementReferenceException: stale element reference: element is not attached to the page document

I don't know what it means or how to fix it.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time

options = webdriver.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

site = 'https://candidat.pole-emploi.fr/offres/emploi/horticulteur/s1m1'
# raw string so the backslashes in the Windows path are not read as escape sequences
wd = webdriver.Chrome(r"C:\Program Files (x86)\chromedriver.exe", options=options)
wd.get(site)

time.sleep(10)

wait = WebDriverWait(wd, 10)

# click cookies popup
wd.find_element(By.XPATH, '//*[(@id = "description")]//*[contains(concat( " ", @class, " " ), concat( " ", "tc-open-privacy-center", " " ))]').click()

time.sleep(10)

# click show more button until no more results to load
while True:
    try:
        more_button = wait.until(EC.visibility_of_element_located((By.LINK_TEXT, 'AFFICHER LES 20 OFFRES SUIVANTES'))).click()
    except TimeoutException:
        break

time.sleep(10)

print(wd.page_source)
print("Complete")

time.sleep(10)
wd.quit()

CodePudding user response:

StaleElementReferenceException: stale element reference: element is not attached to the page document

This indicates that a reference to an element is now "stale": the element is no longer attached to the DOM of the page. The usual reason for this exception is that the DOM was updated or refreshed, for example after an action like click(). If you then use a reference that was obtained before the update, you get this error.

You have to find the element again in the updated or refreshed DOM:

# inside the while loop; requires:
# from selenium.common.exceptions import StaleElementReferenceException
try:
    wait.until(EC.visibility_of_element_located((By.LINK_TEXT, 'AFFICHER LES 20 OFFRES SUIVANTES'))).click()
except StaleElementReferenceException:
    # the reference went stale: locate the button again, then click it
    more_button = WebDriverWait(wd, 10).until(EC.visibility_of_element_located((By.LINK_TEXT, 'AFFICHER LES 20 OFFRES SUIVANTES')))
    more_button.click()
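
The re-find idea can also be wrapped in a small retry helper. This is only a sketch, not part of Selenium: `click_fresh`, `FakeStaleError` and `FakeButton` are hypothetical names made up for illustration, and the fake classes stand in for a real element so that the snippet runs without a browser. With Selenium you would presumably pass `StaleElementReferenceException` as `stale_exc` and a function that calls `wait.until(...)` as `locate`.

```python
from typing import Callable, Type


class FakeStaleError(Exception):
    """Stand-in for Selenium's StaleElementReferenceException (demo only)."""


class FakeButton:
    """Stand-in for a WebElement; raises on click when marked stale."""

    def __init__(self, stale: bool) -> None:
        self.stale = stale

    def click(self) -> None:
        if self.stale:
            raise FakeStaleError("element is not attached to the page document")


def click_fresh(locate: Callable[[], object],
                stale_exc: Type[Exception],
                retries: int = 3) -> int:
    """Locate the element and click it, re-locating after each stale failure.

    Returns the number of attempts used; re-raises if every attempt fails.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        element = locate()  # always fetch a fresh reference before clicking
        try:
            element.click()
            return attempt
        except stale_exc:
            if attempt == retries:
                raise
    return retries  # not reached; keeps the return type explicit


# Demo: the first lookup yields a stale element, the second a live one.
buttons = iter([FakeButton(stale=True), FakeButton(stale=False)])
attempts_used = click_fresh(lambda: next(buttons), FakeStaleError)
print(attempts_used)
```

The important detail is that `locate()` is called again on every attempt. Retrying `click()` on the same reference cannot succeed, because a reference that has gone stale stays stale.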

CodePudding user response:

There are several ways to handle a stale element reference.

One is to retry the click on the web element in a loop.

Your link text also looks wrong. Please use the XPath below:

# wd and wait are the WebDriver and WebDriverWait objects from the question

# click cookies popup
wd.find_element(By.XPATH, '//*[(@id = "description")]//*[contains(concat( " ", @class, " " ), concat( " ", "tc-open-privacy-center", " " ))]').click()

time.sleep(10)

# click show more button until no more results to load
while True:
    try:
        more_button = wait.until(EC.visibility_of_element_located((By.XPATH, "//a[starts-with(@onclick,'tagDeClick') and contains(@href,'/offres/emploi.rechercheoffre:afficherplusderesultats')]")))
        ActionChains(wd).move_to_element(more_button).perform()
        attempts = 0
        while attempts < 2:
            try:
                more_button.click()
                break
            except StaleElementReferenceException as exception:
                print(exception.msg)
            attempts += 1

    except TimeoutException:
        # the button did not show up within the wait: everything is loaded
        break

time.sleep(10)

print(wd.page_source)
print("Complete")

time.sleep(10)

Output :

stale element reference: element is not attached to the page document
  (Session info: chrome=94.0.4606.81)

If you see this in the logs and do not want it there, comment out the print(exception.msg) line.

Imports :

from selenium.webdriver.common.action_chains import ActionChains
from selenium.common.exceptions import StaleElementReferenceException

CodePudding user response:

Try using the execute_script method. I think it is the most reliable way to solve this kind of problem.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, NoSuchElementException
import time

options = webdriver.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

site = 'https://candidat.pole-emploi.fr/offres/emploi/horticulteur/s1m1'
# raw string so the backslashes in the Windows path are not read as escape sequences
wd = webdriver.Chrome(r"C:\Program Files (x86)\chromedriver.exe", options=options)
wd.get(site)

time.sleep(10)

wait = WebDriverWait(wd, 10)

# click cookies popup
wd.find_element(By.XPATH, '//*[(@id = "description")]//*[contains(concat( " ", @class, " " ), concat( " ", "tc-open-privacy-center", " " ))]').click()

time.sleep(10)

# click show more button until no more results to load
while True:
    try:
        # wait.until returns the located element, so no second lookup is needed
        more_button = wait.until(EC.visibility_of_element_located((By.LINK_TEXT, 'AFFICHER LES 20 OFFRES SUIVANTES')))
        # click through JavaScript instead of a native WebDriver click
        wd.execute_script('arguments[0].click()', more_button)
        #print('clicked')
    except (TimeoutException, NoSuchElementException):
        break

time.sleep(10)

print(wd.page_source)
print("Complete")

time.sleep(10)
wd.quit()
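
A possible refinement of the loop above, as a sketch only: after each click, wait until the number of displayed results has grown before looking for the button again, so that the next lookup does not race with the DOM update. `wait_until`, `load_all` and `FakePage` are hypothetical names, and `FakePage` simulates the page so the snippet runs without a browser. With Selenium, `find_button` and `count_results` would presumably be built on `wd.find_elements(...)`; the selector for the result rows is an assumption you would have to fill in for this site.

```python
import time
from typing import Callable, Optional


def wait_until(predicate: Callable[[], bool],
               timeout: float = 10.0,
               interval: float = 0.1) -> bool:
    """Poll predicate until it returns True or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def load_all(find_button: Callable[[], Optional[object]],
             count_results: Callable[[], int],
             timeout: float = 10.0,
             max_clicks: int = 500) -> int:
    """Click "load more" until the button is gone or results stop growing."""
    clicks = 0
    while clicks < max_clicks:
        button = find_button()  # fresh lookup on every pass, never reused
        if button is None:
            break
        before = count_results()
        button.click()
        clicks += 1
        # Let the DOM update land before the button is looked up again.
        if not wait_until(lambda: count_results() > before, timeout):
            break
    return clicks


class FakePage:
    """Simulated result page: 20 results shown per click (demo only)."""

    def __init__(self, total: int, page_size: int = 20) -> None:
        self.total = total
        self.page_size = page_size
        self.shown = min(page_size, total)

    def click(self) -> None:
        self.shown = min(self.shown + self.page_size, self.total)

    def find_button(self) -> Optional["FakePage"]:
        # The "load more" button disappears once everything is displayed.
        return self if self.shown < self.total else None


page = FakePage(total=65)
clicks = load_all(page.find_button, lambda: page.shown, timeout=1.0)
print(clicks, page.shown)
```

Waiting on a condition like this could also replace the fixed `time.sleep(10)` calls, which pause for the full ten seconds even when the page is ready sooner.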