How to extract text from aria-label attribute?

Time:01-17

I am working on a web-scraping task: I need to scrape the work-life balance rating from the Indeed website. The problem is that I do not know how to extract the text from the aria-label attribute so that I get the output 4.0 out of 5 stars.

<div role="img" aria-label="4.0 out of 5 stars."><div ><div data-testid="filledStar" style="width:42.68px" ></div></div></div>

CodePudding user response:

You need to locate the element and call get_attribute("aria-label") on it to read the value.

If you are using Python, the code will be:

print(driver.find_element(By.XPATH, "//div[@role='img']").get_attribute("aria-label"))

Update:

print(driver.find_element(By.XPATH, "//div[@role='img' and @aria-label]").get_attribute("aria-label"))

Or

print(driver.find_element(By.XPATH, "//div[@role='img' and @aria-label][.//div[@data-testid='filledStar']]").get_attribute("aria-label"))
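
If you want the rating as a number rather than the full label, the string returned by get_attribute("aria-label") can be parsed afterwards. A minimal sketch, assuming the label always follows the "X out of Y stars." pattern shown in the question; parse_rating is a hypothetical helper name, not part of Selenium:

```python
import re


def parse_rating(label):
    """Return (score, scale) from a label such as '4.0 out of 5 stars.'.

    Returns None when the label does not match the assumed pattern.
    """
    match = re.search(r"(\d+(?:\.\d+)?)\s+out of\s+(\d+(?:\.\d+)?)", label)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# The label text is taken from the HTML in the question.
print(parse_rating("4.0 out of 5 stars."))  # prints (4.0, 5.0)
```

Returning None for an unexpected label lets the caller decide how to handle pages where the rating is missing or worded differently.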

CodePudding user response:

If you can locate the element, the attribute value can be retrieved in Selenium with the get_attribute() method.
Let's say you are using By.CSS_SELECTOR and the locator is stored in css_selector.
The Python syntax is:

aria_label_value = driver.find_element(By.CSS_SELECTOR, css_selector).get_attribute("aria-label")

The same can be done in other programming languages with slight syntax changes.
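
The same "find the element, then read its attribute" idea can be tried offline, without a browser. The sketch below is not Selenium: it uses Python's standard html.parser on the markup from the question, assuming the HTML is already available as a string. AriaLabelCollector is a hypothetical class name used only for illustration.

```python
from html.parser import HTMLParser


class AriaLabelCollector(HTMLParser):
    """Collect aria-label values from <div role="img"> elements."""

    def __init__(self):
        super().__init__()
        self.labels = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        attr_map = dict(attrs)
        if tag == "div" and attr_map.get("role") == "img" and attr_map.get("aria-label"):
            self.labels.append(attr_map["aria-label"])


# Markup copied from the question.
html = (
    '<div role="img" aria-label="4.0 out of 5 stars.">'
    '<div ><div data-testid="filledStar" style="width:42.68px" ></div></div></div>'
)

collector = AriaLabelCollector()
collector.feed(html)
print(collector.labels)  # prints ['4.0 out of 5 stars.']
```

This only works when the rating is present in the HTML you already have; if the page builds it with JavaScript, a real browser session through Selenium is still needed.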

CodePudding user response:

To retrieve the value of the aria-label attribute, i.e. "4.0 out of 5 stars.", you need to wait for the element with WebDriverWait and visibility_of_element_located(). You can use either of the following locator strategies:

  • Using CSS_SELECTOR and role="img":

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div[role='img'][aria-label]"))).get_attribute("aria-label"))
    
  • Using XPATH and data-testid="filledStar":

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@data-testid='filledStar']//ancestor::div[@role='img' and @aria-label]"))).get_attribute("aria-label"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in Python Selenium - get href value
