In the HTML below, my goal is to return zzde7e35d-8d9d-4763-95d2-9198684abb12
<div class = container>
<a data-type="patch" data-disable-with="Waiting" href="/market/opening/zzde7e35d-8d9d-4763-95d2-9198684abb12">Yes</a>
</div>
The problem is, I can't even seem to locate the URL within the div
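from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
# Selenium 3-style driver setup, matching the find_element_by_xpath call below
driver = webdriver.Chrome(ChromeDriverManager().install())
link = example.url
driver.get(link)
URL = driver.find_element_by_xpath('//a[contains(@href,"market")]')
print(URL)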
Printing the above gives a string of seemingly random characters that has nothing to do with the HTML, let alone the URL in question.
If it simplifies the issue, the returned string will always be the same length. Is indexing an easy workaround?
CodePudding user response:
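Printing the element itself only shows the WebElement object (its session and element IDs), not the link.
To get the href, call get_attribute('href') on the element.
In a live browser this returns the resolved URL, which ends in /market/opening/zzde7e35d-8d9d-4763-95d2-9198684abb12
Then split() it on "/" and take the last element.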
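from selenium.webdriver.common.by import By

link = example.url
driver.get(link)
# find_element_by_xpath was removed in Selenium 4.3; find_element(By.XPATH, ...) is the current form
URL = driver.find_element(By.XPATH, '//a[contains(@href,"market")]')
print(URL.get_attribute('href').split("/")[-1])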
Output:
zzde7e35d-8d9d-4763-95d2-9198684abb12
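If the href ever carries a trailing slash or a query string, a plain split("/")[-1] can pick up the wrong piece. Below is a minimal sketch that uses only the standard library and assumes its input is whatever get_attribute('href') returned. The helper name last_path_segment and the example.com host are hypothetical, not part of the question.

```python
from urllib.parse import urlsplit


def last_path_segment(href: str) -> str:
    """Return the final non-empty path segment of an absolute or relative URL."""
    # urlsplit separates the path from any query string or fragment
    path = urlsplit(href).path
    # Drop a trailing slash so "/a/b/" still yields "b"
    return path.rstrip("/").rsplit("/", 1)[-1]


# example.com is a placeholder host, not taken from the question
print(last_path_segment("https://example.com/market/opening/zzde7e35d-8d9d-4763-95d2-9198684abb12"))
print(last_path_segment("/market/opening/zzde7e35d-8d9d-4763-95d2-9198684abb12/?ref=1"))
```

Slicing by a fixed length, as the question suggests, would also work for as long as the ID length never changes, but splitting on the path is less brittle.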
CodePudding user response:
You are possibly missing a delay.
Instead of
URL = driver.find_element_by_xpath('//a[contains(@href,"market")]')
Try using
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
URL = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, '//a[contains(@href,"market")]'))).get_attribute("href")
print(URL)
You also have to extract the href attribute value from the returned web element object, as shown in the code.
If this still does not work, check whether the element you are trying to access is inside an iframe, or whether the locator matches more than one element.
CodePudding user response:
To print the last part of the href attribute, i.e. zzde7e35d-8d9d-4763-95d2-9198684abb12,
induce WebDriverWait for visibility_of_element_located() and use any one of the following locator strategies:
Using LINK_TEXT:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.LINK_TEXT, "Yes"))).get_attribute("href").split("/")[-1])
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a[data-type='patch'][data-disable-with='Waiting'][href*='market']"))).get_attribute("href").split("/")[-1])
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[@data-type='patch' and @data-disable-with='Waiting' and contains(@href, 'market')]"))).get_attribute("href").split("/")[-1])
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant detailed discussion in Find div aria label starting with certain text and then extract
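One caveat on the choice of index: the browser resolves a relative href against the page URL, so get_attribute("href") usually returns an absolute URL, which shifts every positive index. The sketch below imitates that resolution with urllib.parse.urljoin. The page URL https://example.com/listing is a made-up placeholder, not something from the question.

```python
from urllib.parse import urljoin

page_url = "https://example.com/listing"  # hypothetical page address
raw_href = "/market/opening/zzde7e35d-8d9d-4763-95d2-9198684abb12"  # as written in the HTML

# Resolve the relative href against the page URL, as a browser would
resolved = urljoin(page_url, raw_href)

relative_parts = raw_href.split("/")  # ['', 'market', 'opening', '<id>']
absolute_parts = resolved.split("/")  # ['https:', '', 'example.com', 'market', 'opening', '<id>']

print(relative_parts[3])   # the ID, but only for the relative form
print(absolute_parts[3])   # 'market', not the ID
print(absolute_parts[-1])  # the ID, regardless of scheme and host
```

A negative index counts from the end of the path, so it gives the same result whether the href comes back relative or absolute.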