I'm scraping Glassdoor reviews and can't reach the text written under "Advice to Management". I tried to reach it the same way I reach the data for "pros" and "cons":
def scrape_pros(gdReview):
    try:
        res = gdReview.find_element(By.XPATH, './/span[@data-test="pros"]').text
    except Exception:
        res = 0
    return res

def scrape_cons(gdReview):
    try:
        res = gdReview.find_element(By.XPATH, './/span[@data-test="cons"]').text
    except Exception:
        res = 0
    return res
But the data can't be found, because "Continue reading" first has to be opened (by clicking on it) before the element for "Advice to Management" can be found. HTML of "Continue reading":
<div >Continue reading</div>
I tried to click "Continue reading" in many different ways. Here are two examples; I tried many combinations of them:
gdReview.find_element(By.XPATH, './/div[@]').click()
gdReview.find_element(By.XPATH, './/div[@class ="row mt-xsm mx-0"]/preceding-sibling::div[text()="Continue Reading"]').click()
But those didn't work.
I would really appreciate some help.
CodePudding user response:
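The element you want to click is a <div> whose textContent is Continue reading, with a lowercase "r". XPath string comparison is case-sensitive, so text()="Continue Reading" in your second attempt cannot match it.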
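As a quick check of the case-sensitivity point, here is a minimal sketch that uses Python's built-in xml.etree.ElementTree as a stand-in for the browser's XPath engine. The one-line markup is a simplified assumption, not the real Glassdoor page; exact string matches are case-sensitive in both engines.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for one review (assumption: the real markup has more
# nesting and class attributes that are not reproduced here).
review = ET.fromstring("<div><div>Continue reading</div></div>")

# Exact-match predicates compare the text character by character.
wrong_case = review.findall(".//div[.='Continue Reading']")  # capital "R"
right_case = review.findall(".//div[.='Continue reading']")  # lowercase "r"

print(len(wrong_case), len(right_case))  # prints: 0 1
```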
Solution
To click the element, induce WebDriverWait for element_to_be_clickable(). You can use any of the following locator strategies:
Using XPATH and the textContent Continue reading:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[contains(., 'Continue reading')]"))).click()
Using XPATH and the v2__EIReviewDetailsV2__continueReading class:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[contains(@class, 'v2__EIReviewDetailsV2__continueReading') and contains(., 'Continue reading')]"))).click()
Using XPATH and the v2__EIReviewDetailsV2__clickable class:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[contains(@class, 'v2__EIReviewDetailsV2__clickable') and contains(., 'Continue reading')]"))).click()
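The locators above start at the document root, so they find the first match on the whole page. Since the question works per review, a relative lookup may fit better. The following is only a sketch under stated assumptions: scrape_advice is a hypothetical helper named after scrape_pros and scrape_cons, the data-test value "advice-management" is a guess that has to be checked against the page source, and normalize-space() is swapped in for contains(., ...) because contains also matches every ancestor <div> of the link.

```python
XPATH = "xpath"  # the same string that selenium exposes as By.XPATH

# Relative locators (leading "."), so each lookup stays inside one review.
CONTINUE_READING = './/div[normalize-space()="Continue reading"]'
# ASSUMPTION: this data-test value is a guess modelled on "pros" / "cons".
ADVICE = './/span[@data-test="advice-management"]'


def scrape_advice(gd_review):
    """Expand one review if possible, then return its advice text (0 if absent)."""
    try:
        gd_review.find_element(XPATH, CONTINUE_READING).click()
    except Exception:
        pass  # no such link in this review, or it could not be clicked
    try:
        return gd_review.find_element(XPATH, ADVICE).text
    except Exception:
        return 0  # same fallback value as scrape_pros / scrape_cons
```

It would be called as scrape_advice(gdReview) next to the other two functions. If the advice text only renders a moment after the click, the second lookup may still need a WebDriverWait like the ones above.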
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC