I'm having trouble getting the div ids from an element I have scraped. Let's say this is the website's HTML in the area I have pulled via the class name "products".
The children of the listed divs also have ids, and I do not want those, nor the ids from the rest of the page.
<div class="products">
<div>
<div id="123">...</div>
<div id="456">...</div>
<div id="789">...</div>
<div id="012">...</div>
</div>
</div>
Here is my Selenium/Python code. I've tried several approaches but have been unable to find one that works.
driver.get("https://genericwebsite.org")
scrape = driver.find_element(By.CLASS_NAME, 'products')
# Here are some solutions I've tried to no avail
ids = scrape.find_elements(By.XPATH, "./child::*").get_attribute('id')
# This one pulls ALL ids, including those of the children's children and possibly the whole page.
ids = scrape.find_elements(By.XPATH, '//*[@id]')
I've tried many iterations but can't seem to find the solution. My desired result is as follows:
ids = ["123", "456", "789", "012"]
CodePudding user response:
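Try this code:
ids = []
# Select the grandchildren of the "products" container: products > div > div.
scrape = driver.find_elements(By.XPATH, "//div[@class='products']/div/div")
for item in scrape:
    # get_attribute returns the value of the element's id attribute
    ids.append(item.get_attribute('id'))
print(ids)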
This code selects each of the divs listed above, reads its id with the get_attribute method, and appends it to the ids list.
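
To see why the path matters, here is a minimal sketch that compares a relative path with a descendant search. It uses the standard library's xml.etree.ElementTree instead of Selenium so that it runs without a browser, and the sample markup (the header div and the nested span ids) is made up for illustration. With Selenium, the same relative expression could be passed to the element already found in the question, e.g. scrape.find_elements(By.XPATH, "./div/div"), since an XPath that starts with "./" is evaluated from that element, while one that starts with "//" searches the whole page.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup modelled on the question; the header div and the
# nested span ids are assumptions added to show what each path matches.
SAMPLE = """
<body>
  <div id="header">site header</div>
  <div class="products">
    <div>
      <div id="123"><span id="inner-a">...</span></div>
      <div id="456"><span id="inner-b">...</span></div>
      <div id="789">...</div>
      <div id="012">...</div>
    </div>
  </div>
</body>
"""

root = ET.fromstring(SAMPLE)
products = root.find("./div[@class='products']")

# Relative path: only the grandchildren of the products container.
direct_ids = [el.get("id") for el in products.findall("./div/div")]

# Descendant search: every element below the container that has an id,
# which also picks up the ids nested inside the product divs.
all_ids = [el.get("id") for el in products.findall(".//*[@id]")]

print(direct_ids)
print(all_ids)
```

The first list contains only the four product ids, while the second also includes the nested ones, which is the behaviour described in the question.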