I'm very new to Python/coding, so please bear with me.
I'm trying to extract the text from the title of a webpage (input by the user) by locating the page's WebElement and reading its value using Selenium.
However, it keeps returning the value 'none' instead of what I would expect to see (in this case 'BLACK BELTED WRAP COAT').
The code can be found below:
title = driver.find_elements(By.XPATH,('/html/body/div[4]/div/div[3]/div[4]/div[1]/div[1]/form/div/div[2]/a/h2'))
# Rest of the code is hidden, but if you need more, please do let me know. (I'm new and don't want to spam.)
Any idea what's causing this?
The source URL I'm entering is: https://www.riverisland.com/p/black-belted-wrap-coat-782866
This runs without error, but returns an unexpected value.
Appreciate it and apologies if I've missed anything. Ginge
CodePudding user response:
If you are trying to find an element, use find_element instead of find_elements. find_elements will return a list of WebElements.
Try the code below:
# Imports required for explicit waits
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver.get("https://www.riverisland.com/p/black-belted-wrap-coat-782866")
wait = WebDriverWait(driver,30)
# Click on Accept cookies
wait.until(EC.element_to_be_clickable((By.NAME,"accept-all"))).click()
title = wait.until(EC.visibility_of_element_located((By.XPATH,"//h2[@data-localize='Product_Title']")))
print(title.text)
Output:
BLACK BELTED WRAP COAT
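If you would rather keep find_elements, remember that the result is a plain Python list, so you have to pick an item out of it before reading .text. Here is a minimal sketch of that idea. The helper name first_text is made up for illustration, and the SimpleNamespace objects are only stand-ins for real WebElements so the snippet runs without a browser:

```python
from types import SimpleNamespace


def first_text(elements):
    """Return the .text of the first item in a find_elements-style list.

    Hypothetical helper: returns None when the list is empty, which is
    what find_elements gives back if nothing matched the locator.
    """
    if not elements:
        return None
    return elements[0].text


# Stand-ins for WebElements (assumption: real elements expose .text the same way).
fake_elements = [SimpleNamespace(text="BLACK BELTED WRAP COAT")]

print(first_text(fake_elements))
print(first_text([]))
```

With a real driver you would pass in the list returned by driver.find_elements(By.XPATH, ...) instead of fake_elements.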
CodePudding user response:
To print the text BLACK BELTED WRAP COAT you can use either of the following locator strategies:
Using css_selector and get_attribute("innerHTML"):
print(driver.find_element(By.CSS_SELECTOR, "h2.product-title.ui-product-title").get_attribute("innerHTML"))
Using xpath and text attribute:
print(driver.find_element(By.XPATH, "//h2[@class='product-title ui-product-title']").text)
Ideally you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following locator strategies:
Using CSS_SELECTOR and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h2.product-title.ui-product-title"))).text)
Using XPATH and get_attribute("innerHTML"):
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h2[@class='product-title ui-product-title']"))).get_attribute("innerHTML"))
Console Output:
BLACK BELTED WRAP COAT
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Links to useful documentation:
- get_attribute() method: gets the given attribute or property of the element.
- text attribute: returns the text of the element.
- Difference between text and innerHTML using Selenium
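One practical consequence of the text vs innerHTML difference: innerHTML gives you the raw markup inside the element, so if the title ever contains nested tags you get those tags back too, whereas text gives you only the rendered text. Below is a rough sketch of flattening an innerHTML string with the standard library. The helper name inner_html_to_text and the sample markup are made up for illustration, and this only approximates what .text does (it ignores CSS visibility, for example):

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def inner_html_to_text(inner_html):
    """Hypothetical helper: drop tags from an innerHTML string and collapse whitespace."""
    collector = _TextCollector()
    collector.feed(inner_html)
    collector.close()  # flush any trailing text still buffered by the parser
    return " ".join("".join(collector.parts).split())


# Made-up markup, not taken from the actual page.
print(inner_html_to_text("<span>BLACK</span> BELTED WRAP COAT"))
```

If the element holds plain text only, as it seems to here, both approaches print the same string and you can simply use .text.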
CodePudding user response:
You have all been massively helpful!
I got this fixed, thanks.
Ginge