I use this method to scrape elements:
name = driver.find_elements(By.XPATH, '//div[@]/a/em/font[3]/font')
But when I want the inner product details, I have to navigate to that item's page (the single product page).
There I can only access that one item's data. It gives me 1 item of data, but I want to scrape the data for all the items.
I want to scrape the details indicated by the red arrow, using XPath.
CodePudding user response:
To scrape the internal data of the products, you will have to click on them one by one. Each product opens in a new tab, so you have to switch to that tab before you can scrape it.
Code:
# Assumes `driver` is an already-created Selenium WebDriver and the imports listed below are in place.
driver.maximize_window()
wait = WebDriverWait(driver, 20)
driver.get("https://search.jd.com/Search?keyword=两件套套装裙&enc=utf-8&wq=两件套套装裙&pvid=c35452079d6240b3a5fab6c585b53856")
all_products = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//img[@data-img and not(@data-url) and @height='220']")))
print(len(all_products))
i = 1  # XPath positions are 1-based
for product in all_products:
    # Re-locate the i-th product image on each pass, then scroll it into view and click it
    prd = wait.until(EC.visibility_of_element_located((By.XPATH, f"(//img[@data-img and not(@data-url) and @height='220'])[{i}]")))
    driver.execute_script("arguments[0].scrollIntoView(true);", prd)
    prd.click()
    # The product page opens in a new tab, so switch to it
    all_handles = driver.window_handles
    driver.switch_to.window(all_handles[1])
    print(wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.sku-name"))).get_attribute('innerText'))
    print(wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.p-price"))).text)
    # Close the product tab and return to the search results
    driver.close()
    driver.switch_to.window(all_handles[0])
    i = i + 1
Imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
The website responds very slowly, so I could not run the entire script. However, the above code should work fine in your region.
Also, Stack Overflow is not letting me post the output because it contains some special characters. Please see the comment for the output.
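One possible refinement, offered as a sketch rather than a tested part of the answer: the loop switches to `all_handles[1]`, which assumes the product tab is always second in `driver.window_handles`. If that ordering cannot be relied on, the new tab can be identified by comparing the handles recorded before the click with those present afterwards. The helper name `new_window_handle` and the handle strings below are hypothetical and used only for illustration.

```python
from typing import Sequence


def new_window_handle(before: Sequence[str], after: Sequence[str]) -> str:
    """Return the single handle present in `after` but not in `before`."""
    opened = [handle for handle in after if handle not in before]
    if len(opened) != 1:
        raise ValueError(f"expected exactly one new window, found {len(opened)}")
    return opened[0]


# Intended use inside the loop (not run here; it needs a live `driver`):
#   before = driver.window_handles
#   prd.click()
#   wait.until(EC.number_of_windows_to_be(len(before) + 1))
#   driver.switch_to.window(new_window_handle(before, driver.window_handles))

# Stand-in handle strings, for illustration only.
print(new_window_handle(["main"], ["tab-2", "main"]))
```

Waiting with `EC.number_of_windows_to_be` before reading the handles also guards against switching before the slow site has actually opened the tab.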