I am working on web scraping and need to scrape the work-life balance rating from the Indeed website. The challenge is that I do not know how to extract the text from the aria-label attribute, so that I get the output "4.0 out of 5 stars."
<div role="img" aria-label="4.0 out of 5 stars."><div ><div data-testid="filledStar" style="width:42.68px" ></div></div></div>
CodePudding user response:
You need to locate the element and read its aria-label attribute with get_attribute() to get the value.
If you are using Python, the code will be:
from selenium.webdriver.common.by import By
print(driver.find_element(By.XPATH, "//div[@role='img']").get_attribute("aria-label"))
Update: to match only elements that actually carry an aria-label attribute:
print(driver.find_element(By.XPATH, "//div[@role='img' and @aria-label]").get_attribute("aria-label"))
Or, to also require the filledStar descendant:
print(driver.find_element(By.XPATH, "//div[@role='img' and @aria-label][.//div[@data-testid='filledStar']]").get_attribute("aria-label"))
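To check offline that the value really lives in the aria-label attribute of the div with role="img", here is a minimal sketch that uses Python's standard html.parser instead of Selenium. It is not a replacement for the locators above, only an illustration run against the snippet from the question; the names AriaLabelCollector, extract_aria_labels and SNIPPET are made up for this example.

```python
from html.parser import HTMLParser


class AriaLabelCollector(HTMLParser):
    """Collects aria-label values from <div role="img"> start tags."""

    def __init__(self):
        super().__init__()
        self.labels = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a dict makes lookups easier.
        attr_map = dict(attrs)
        if tag == "div" and attr_map.get("role") == "img" and "aria-label" in attr_map:
            self.labels.append(attr_map["aria-label"])


# The HTML fragment from the question (hypothetical constant name).
SNIPPET = (
    '<div role="img" aria-label="4.0 out of 5 stars.">'
    '<div ><div data-testid="filledStar" style="width:42.68px" ></div></div></div>'
)


def extract_aria_labels(html):
    """Return every aria-label found on a <div role="img"> in the given HTML."""
    parser = AriaLabelCollector()
    parser.feed(html)
    parser.close()
    return parser.labels


print(extract_aria_labels(SNIPPET))
```

This only works on static HTML you already have as a string; for a page rendered in the browser, the Selenium calls above are still the way to go.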
CodePudding user response:
If you can locate that element, its attribute value can be retrieved in Selenium with the get_attribute() method.
Let's say you are using By.CSS_SELECTOR and the locator string is stored in css_selector.
The Python syntax is:
aria_label_value = driver.find_element(By.CSS_SELECTOR, css_selector).get_attribute("aria-label")
The same can be done in other programming languages with slight syntax changes.
CodePudding user response:
To retrieve the value of the aria-label attribute, i.e. "4.0 out of 5 stars.", you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR and role="img":
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div[role='img'][aria-label]"))).get_attribute("aria-label"))
Using XPATH and data-testid="filledStar":
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@data-testid='filledStar']/ancestor::div[@role='img' and @aria-label]"))).get_attribute("aria-label"))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in Python Selenium - get href value
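If you need the numeric rating and not the whole label, the string returned by get_attribute("aria-label") can be parsed with a regular expression. This is a minimal sketch that assumes the label always follows the "X out of Y stars." pattern shown in the question; parse_star_rating is a hypothetical helper name, not part of Selenium.

```python
import re


def parse_star_rating(label):
    """Return (rating, scale) parsed from a label such as '4.0 out of 5 stars.'.

    Returns None when the label does not match the assumed pattern.
    """
    match = re.search(r"(\d+(?:\.\d+)?)\s+out of\s+(\d+(?:\.\d+)?)", label)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# The label text from the question.
print(parse_star_rating("4.0 out of 5 stars."))  # (4.0, 5.0)
```

Returning None for an unexpected label lets the caller decide whether to skip the entry or raise an error.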