Scraping Glassdoor Reviews with Python Selenium: How to click on Continue Reading in a Glassdoor Rev

I'm scraping Glassdoor reviews and can't reach the text written under "Advice to Management". I tried to reach it the same way I reach the data for "pros" and "cons":

    def scrape_pros(gdReview):
        try:
            res = gdReview.find_element(By.XPATH, './/span [@data-test="pros"]').text
        except Exception:
            res = 0
        return res
       
    def scrape_cons(gdReview):
        try:
            res = gdReview.find_element(By.XPATH, './/span [@data-test="cons"]').text
        except Exception:
            res = 0
        return res

But the data can't be found, because the "Continue reading" link has to be opened first (by clicking on it); only then can the element for "Advice to Management" be found. This is the HTML of the Continue reading element:

<div >Continue reading</div>

I tried to click "Continue reading" in many different ways. Here are two of the locators I tried, in many variations:

gdReview.find_element(By.XPATH, './/div[@]').click()
gdReview.find_element(By.XPATH, './/div[@class ="row mt-xsm mx-0"]/preceding-sibling::div[text()="Continue Reading"]').click()

But those didn't work.

I would really appreciate some help.

CodePudding user response:

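The text you want to click sits inside a <div> element whose text content is Continue reading, with a lowercase "r". XPath string comparisons are case-sensitive, so text()="Continue Reading" in your second locator cannot match it.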
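The case-sensitivity point can be checked without a browser. The following is only a minimal sketch: it uses the standard library's xml.etree.ElementTree instead of Selenium, and the markup is a hypothetical, simplified stand-in for one review, not the real Glassdoor page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup for one review (the real page differs).
snippet = """
<li>
  <span data-test="pros">Good team</span>
  <div>Continue reading</div>
</li>
"""
review = ET.fromstring(snippet)

# Same idea as the locator in the question: capital "R", so no match.
wrong_case = review.findall(".//div[.='Continue Reading']")
# Text copied exactly from the HTML: lowercase "r", so it matches.
right_case = review.findall(".//div[.='Continue reading']")

print(len(wrong_case))  # 0
print(len(right_case))  # 1
```

The same rule applies to the XPath engine that Selenium uses in the browser, so copy the text exactly as it appears in the HTML.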

Solution

To click the element, wait for it with WebDriverWait and element_to_be_clickable(). You can use any of the following locator strategies:

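Using XPATH and the exact text Continue reading (contains(., 'Continue reading') on its own would also match every ancestor <div> that wraps the link, so the first match could be the wrong element):

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[normalize-space()='Continue reading']"))).click()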

Using XPATH and the v2__EIReviewDetailsV2__continueReading class:

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[contains(@class, 'v2__EIReviewDetailsV2__continueReading') and contains(., 'Continue reading')]"))).click()

Using XPATH and the v2__EIReviewDetailsV2__clickable class:

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[contains(@class, 'v2__EIReviewDetailsV2__clickable') and contains(., 'Continue reading')]"))).click()

Note: You need the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC