from bs4 import BeautifulSoup
import pandas as pd
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.flaconi.de/haare/maria-nila/head-and-hair-heal/maria-nila-head-and-hair-heal-haarshampoo.html#sku=80021856-100')
soup = BeautifulSoup(driver.page_source,'html.parser')
soup.find('div', class_ = 'average-rating')
It returns nothing (None), even though I am sure the rating is shown on the website.
CodePudding user response:
That value is stored in a script tag. You can extract it from response.text with a regex, though I would unescape the HTML entities first to make the regex more readable.
import requests, re, html
r = requests.get('https://www.flaconi.de/haare/maria-nila/head-and-hair-heal/maria-nila-head-and-hair-heal-haarshampoo.html#sku=80021856-100')
avg_rating = round(float(re.search(r'"ratingValue":(.*?),', html.unescape(r.text)).group(1)), 1)
print(avg_rating)
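To try the unescape-then-regex idea without hitting the site, here is a minimal offline sketch. The SAMPLE_HTML string and the extract_rating helper are made up for illustration; the real page's markup may differ, so treat the pattern as an assumption to adapt rather than the site's actual structure.

```python
import html
import re

# Hypothetical markup, invented for this example: a script tag whose JSON
# payload has its quotes encoded as HTML entities.
SAMPLE_HTML = (
    '<script type="application/ld+json">'
    '{&quot;aggregateRating&quot;:{&quot;ratingValue&quot;:4.8235,'
    '&quot;reviewCount&quot;:17}}'
    '</script>'
)


def extract_rating(page_text):
    """Return the rating rounded to one decimal, or None if it is absent."""
    # Turn &quot; back into " so the pattern can be written with plain quotes.
    text = html.unescape(page_text)
    # Accept an optional opening quote in case the value is stored as a string.
    match = re.search(r'"ratingValue"\s*:\s*"?(\d+(?:\.\d+)?)', text)
    if match is None:
        return None
    return round(float(match.group(1)), 1)


print(extract_rating(SAMPLE_HTML))
```

Returning None when nothing matches avoids the AttributeError you would otherwise get from calling .group(1) on a failed search.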
CodePudding user response:
There is no need to use BeautifulSoup here. Selenium can find the element on its own without any problem.
import pandas as pd
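from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get('https://www.flaconi.de/haare/maria-nila/head-and-hair-heal/maria-nila-head-and-hair-heal-haarshampoo.html#sku=80021856-100')
# The find_element_by_* helpers are deprecated in Selenium 4 and removed in later 4.x releases; use find_element with a By locator instead
print(driver.find_element(By.CLASS_NAME, 'average-rating').text)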
Output
4.8