Nested span element URL text value - Selenium

I am trying to get the value #2011, which is the text of a URL link, from the HTML below. I tried the code below, but it didn't work. It says it is unable to locate the class.

driver.find_element(By.XPATH, '//span[@class = "data-issue-and-pr-hovercards-enabled"]').get_attribute('a')

Can anyone help to correct the mistake? I am new to selenium.

<span data-issue-and-pr-hovercards-enabled>
  <span><span> · Fixed by <a href="https://github.com/mlpack/mlpack/pull/2011" data-hydro-click="{&quot;event_type&quot;:&quot;issue_cross_references.click&quot;,&quot;payload&quot;:{&quot;reference_location&quot;:&quot;ISSUE_HEADER&quot;,&quot;user_id&quot;:8344482,&quot;issue_id&quot;:490664092,&quot;pull_request_id&quot;:315249262,&quot;originating_url&quot;:&quot;https://github.com/mlpack/mlpack/issues/2008&quot;}}" data-hydro-click-hmac="aadfd36202d72a8e9a7ce379994bac18a7e2052adfb12d43eea1c8f41e12bfde" data-hovercard-type="pull_request" data-hovercard-url="/mlpack/mlpack/pull/2011/hovercard">#2011</a></span><span></span></span>
</span>

Here is the link to the website: github.com/mlpack/mlpack/issues/2008. I want to get the #2011 that appears next to the "Fixed by" text (below the title of the issue). Is it possible to do this?

CodePudding user response：

Try the XPath below. This relative XPath matches any element (*) whose text contains "#2011":

//*[contains(text(),'#2011')]

Or try the one below. It works the same way but searches only <a> elements:

//a[contains(text(),'#2011')]

Update:

Try the below XPath:

//span[contains(text(),'Fixed by')]//a

Use the .text property to fetch the required value. For the HTML in the question, it returns #2011.

CodePudding user response：

In the given HTML, data-issue-and-pr-hovercards-enabled is an attribute of the <span>, not the value of its class attribute, so an XPath that tests @class cannot match it.
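To see the difference without a browser, here is a small sketch that parses a shortened copy of the question's markup with Python's standard html.parser module and lists the attributes of the outer span. The class name SpanAttributeCollector and the trimmed snippet are only for illustration; this is not part of Selenium.

```python
from html.parser import HTMLParser

# Simplified copy of the markup from the question (long data-* attributes dropped).
SNIPPET = (
    '<span data-issue-and-pr-hovercards-enabled>'
    '<span><span> · Fixed by '
    '<a href="https://github.com/mlpack/mlpack/pull/2011" '
    'data-hovercard-type="pull_request">#2011</a>'
    '</span><span></span></span></span>'
)


class SpanAttributeCollector(HTMLParser):
    """Records the attributes of every <span> start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.span_attrs = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.span_attrs.append(dict(attrs))


collector = SpanAttributeCollector()
collector.feed(SNIPPET)
collector.close()

outer_span = collector.span_attrs[0]
# The marker is an attribute *name* with no value; the span has no class at all.
print("data-issue-and-pr-hovercards-enabled" in outer_span)
print("class" in outer_span)
```

Because the span has no class attribute, a locator has to test for the presence of the attribute itself, as in //span[@data-issue-and-pr-hovercards-enabled] or the CSS form span[data-issue-and-pr-hovercards-enabled].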

To extract the text #2011, ideally you should use WebDriverWait with visibility_of_element_located(), and you can use either of the following locator strategies:

Using CSS_SELECTOR and the text attribute:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span[data-issue-and-pr-hovercards-enabled] a[data-hovercard-type='pull_request'][data-hovercard-url='/mlpack/mlpack/pull/2011/hovercard']"))).text)

Using XPATH and get_attribute("innerHTML"):

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[@data-issue-and-pr-hovercards-enabled]//a[@data-hovercard-type='pull_request' and @data-hovercard-url='/mlpack/mlpack/pull/2011/hovercard']"))).get_attribute("innerHTML"))

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

You can find a relevant discussion in "How to retrieve the text of a WebElement using Selenium - Python".

References

Link to useful documentation:

get_attribute() method: gets the given attribute or property of the element.
text attribute: returns the text of the element.
Difference between text and innerHTML using Selenium