I have a simple question that assumingly can be solved very easily. I however used some time now to extract the four lines of information as shown here:
I first try to access the <ul _ngcontent-xl-byg-c79=""
item to then loop over the <li _ngcontent-xl-byg-c79="" >
items in order to store the embedded information in my dataframe (columns are 'Mærke', 'Produkttype', 'Serie', and 'Model').
What do I do wrong? My problem is that the four lines have the same "class" name, which gives me the same output in all four loops.
This is my code:
from selenium import webdriver
import pandas as pd
# Activate web browser: External control
browser = webdriver.Chrome(r'C:\Users\KristerJens\Downloads\chromedriver_win32\chromedriver')
# Get webpage
browser.get("https://www.xl-byg.dk/shop/knauf-insulation-ecobatt-murfilt-190-mm-2255993")
# Get information
brand= []
product= []
series=[]
model=[]
for i in browser.find_elements_by_xpath("//ul[@class='short ng-star-inserted']/li"):
for p in i.find_elements_by_xpath("//span[@class='attribute-name']"):
brand.append(i.find_elements_by_class_name('?').text)
product.append(i.find_elements_by_class_name('?').text)
series.append(i.find_elements_by_class_name('?').text)
model.append(i.find_elements_by_class_name('?').text)
df = pd.DataFrame()
df['brand'] = brand
df['product'] = product
df['series'] = series
df['model'] = model
Any help is very appreciated!!
CodePudding user response:
Try like below and confirm:
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Chrome(executable_path="path to chromedriver.exe")
driver.maximize_window()
driver.implicitly_wait(10)
driver.get("https://www.xl-byg.dk/shop/knauf-insulation-ecobatt-murfilt-190-mm-2255993")
wait = WebDriverWait(driver,30)
# Cookie pop-up
wait.until(EC.element_to_be_clickable((By.XPATH,"//button[@aria-label='Accept all' or @aria-label = 'Accepter alle']"))).click()
options = driver.find_elements_by_xpath("//div[@class='row-column']//ul[contains(@class,'short')]/li")
for opt in options:
attribute = opt.find_element_by_xpath("./span[@class='attribute-name']").text # Use a "." in the xpath to find element within in an element
value = opt.find_element_by_xpath("./*[contains(@class,'ng-star-inserted')]").text
print(f"{attribute} : {value}")
Mærke : Knauf Insulation
Produkttype : Murfilt
Serie : ECOBATT
Materiale : Glasmineraluld