I am extracting some data from the URL https://blinkit.com/prn/catch-cumin-seedsjeera-whole/prid/56692, where the Product Details elements are unstructured.
Using this code:
product_details = wd.find_elements(by=By.XPATH, value="//div[@class='ProductAttribute__ProductAttributesDescription-sc-dyoysr-2 lnLDYa']")
info_shelf_life = product_details[0].text.strip()
info_country_of_origin = product_details[1].text.strip()
As you can see, the Product Details elements are unstructured, so this index-based approach breaks when the order of the elements changes from URL to URL.
So I tried the following approach instead, which throws a NoSuchWindowException:
info_shelf_life = wd.find_element(By.XPATH,value= "//div[[contains(@class, 'ProductAttribute__ProductAttributesDescription-sc-dyoysr-2 lnLDYa') and contains(., 'Shelf Life')]/..")
print(info_shelf_life.text.strip())
How can I extract the text inside a div based on the text inside the span tags?
CodePudding user response:
Your XPath is invalid: the predicate opens with a doubled bracket ([[). You can try
info_shelf_life = wd.find_element(By.XPATH, '//p[span="Shelf Life"]/following-sibling::div').text
info_country_of_origin = wd.find_element(By.XPATH, '//p[span="Country of Origin"]/following-sibling::div').text
to get the required data.
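If some product pages lack one of the labels, find_element will raise NoSuchElementException. A minimal sketch of a label-based lookup that returns a default instead is shown here. The helper names detail_xpath and get_detail are hypothetical, and the sketch assumes the markup implied by the XPath above: each label is a span inside a p, and the value is in a div that follows that p as a sibling.

```python
# Sketch only: detail_xpath and get_detail are hypothetical helper names.
# Assumes each label is a <span> inside a <p>, and the value sits in a
# <div> that follows that <p> as a sibling (as the XPath above implies).

def detail_xpath(label):
    """Build the label-based XPath for one Product Details entry."""
    # Note: a label containing a double quote would break this expression.
    return f'//p[span="{label}"]/following-sibling::div'


def get_detail(driver, label, default=None):
    """Return the stripped value text for `label`, or `default` if absent."""
    # "xpath" is the string value of Selenium's By.XPATH constant.
    matches = driver.find_elements("xpath", detail_xpath(label))
    # find_elements returns an empty list (no exception) when nothing matches.
    return matches[0].text.strip() if matches else default
```

With the wd driver from the question, this would be called as get_detail(wd, "Shelf Life") or get_detail(wd, "Country of Origin"), and a missing label gives None rather than an exception.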