I am trying to extract the src URL of every img element on the following page:
<div >
<div >
<div ><img src="https://t3a.coupangcdn.com/thumbnails/remote/212x212ex/image/vendor_inventory/6ca9/2e097d911efc291473d0c47052cdc8f42d7b7b8f2a3ebbb0ccc974d76fe4.jpg" alt="product"><div><button type="button" >
</div></div>
<div >
<div >
<img src="https://thumbnail11.coupangcdn.com/thumbnails/remote/212x212ex/image/retail/images/239519218793467-6edc7d92-4165-4476-a528-fa238ffeeeb6.jpg" alt="product"><div></div></div>
I tried to get them as follows:
ele = driver.find_elements_by_xpath("//div[@class='product-picture']/img")
print(ele)
>>>
<selenium.webdriver.remote.webelement.WebElement (session="d9fd08b93bd5dd83fe520826c1f6fd77", element="27ef8c33-624d-4166-9dc7-3a355c4dcc32")>
<selenium.webdriver.remote.webelement.WebElement (session="d9fd08b93bd5dd83fe520826c1f6fd77", element="a6d77107-fecf-4c84-a048-9b4bda39b9df")>
<selenium.webdriver.remote.webelement.WebElement (session="d9fd08b93bd5dd83fe520826c1f6fd77", element="1f62cb8b-df58-4f06-afe6-6c60cb572527")>
What I want is the src URL of every img element on the page, as a string. Is there a way to extract it?
CodePudding user response:
from selenium.webdriver.common.by import By
images = driver.find_elements(By.XPATH, "//div[@class='product-picture']/img")
for img in images:
    print(img.get_attribute("src"))
This will give you the expected output:
https://t3a.coupangcdn.com/thumbnails/remote/212x212ex/image/vendor_inventory/6ca9/2e097d911efc291473d0c47052cdc8f42d7b7b8f2a3ebbb0ccc974d76fe4.jpg
https://thumbnail11.coupangcdn.com/thumbnails/remote/212x212ex/image/retail/images/239519218793467-6edc7d92-4165-4476-a528-fa238ffeeeb6.jpg
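Since the question asks for the strings themselves rather than printed output, the loop above can be wrapped in a small helper that returns a list. This is only a sketch: collect_srcs and FakeImage are hypothetical names introduced for illustration, and FakeImage merely stands in for a WebElement so the helper can be tried without a browser. It also skips elements with no src, because get_attribute returns None when the attribute is absent.

```python
def collect_srcs(elements):
    """Return the non-empty src strings of the given elements, in order."""
    srcs = []
    for element in elements:
        src = element.get_attribute("src")  # None when the attribute is absent
        if src:
            srcs.append(src)
    return srcs


class FakeImage:
    """Hypothetical stand-in for a WebElement (illustration only)."""

    def __init__(self, src):
        self._src = src

    def get_attribute(self, name):
        return self._src if name == "src" else None


fake_images = [
    FakeImage("https://example.com/a.jpg"),
    FakeImage(None),  # an img without a src is skipped
    FakeImage("https://example.com/b.jpg"),
]
print(collect_srcs(fake_images))
# ['https://example.com/a.jpg', 'https://example.com/b.jpg']
```

With a real driver, the same helper would be called as collect_srcs(driver.find_elements(By.XPATH, "//div[@class='product-picture']/img")).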
CodePudding user response:
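Use the get_attribute('src') method to grab the src value. Note that find_elements_by_xpath returns a list, so get_attribute has to be called on each element rather than on the list itself:
ele = [img.get_attribute('src') for img in driver.find_elements_by_xpath("//div[@class='product-picture']/img")]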
CodePudding user response:
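You are using deprecated syntax: the find_element_by_* and find_elements_by_* commands are deprecated in favor of find_element(By...) and find_elements(By...). Please see "python selenium DeprecationWarning: find_element_by_* commands are deprecated".
A more reliable way to locate elements that are likely to be lazy-loaded is to wait for them explicitly: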
# Wait up to 10 seconds for the matching img elements to be present in the DOM
images = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@class='product-picture']/img")))
for i in images:
    print(i.get_attribute('src'))
You will also need the following imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Selenium docs can be found at https://www.selenium.dev/documentation/
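To see why the explicit wait helps with lazy-loaded content, it may be useful to look at the idea behind it: the condition is called repeatedly until it returns something truthy or the timeout elapses. The following is a simplified plain-Python illustration of that polling loop, not Selenium's actual implementation (which, among other things, ignores certain exceptions between polls and raises its own TimeoutException). The names poll_until and images_present are hypothetical.

```python
import time


def poll_until(condition, timeout=10.0, interval=0.5):
    """Call condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Hypothetical condition: pretends the images only appear on the third poll.
attempts = {"count": 0}


def images_present():
    attempts["count"] += 1
    return ["img-1", "img-2"] if attempts["count"] >= 3 else []


print(poll_until(images_present, timeout=2.0, interval=0.01))
# ['img-1', 'img-2']
```

In the Selenium version, the condition plays the role of EC.presence_of_all_elements_located, which returns the list of matching elements once at least one is present.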