I'm trying to parse HTML from this ecommerce website with selenium and beautifulsoup in Python to get the following text from the product cards: "Electric outlet 220V 32A IP44" and "34.60".
Problem:
I get empty values for my variables productCardTitle and productPriceValue.
This is my HTML:
<ul class="products-cards products-cards--view-grid grid">
  <li data-uid="210a10d0-4976-4c94-9ab4-0e7d38459c02" class="products-cards__item product-card">
    <div class="product-card__top">...</div>
    <div class="product-card__content">...</div>
    <div class="product-card__title-wrapper">
      <h3 class="product-card__title">
        <a href="/catalog/Electric-outlet-220V-32A-IP44">
          Electric outlet 220V 32A IP44
        </a>
      </h3>
    </div>
    ...
    <div class="product-card__bottom">
      <p class="product-card__price product-price product-price--small">...</p>
      <p class="product-card__price product-price">
        <span class="visually-hidden">Price:</span>
        <span class="product-price__value">34.60</span>
        <span class="product-price__currency">...</span>
        ...
  </li>
</ul>
This is my code in Python:
productCards = soup.find_all('li', class_="products-cards__item product-card")
for productCard in productCards:
    productCardTitle = productCard.find_all('h3', class_="product-card__title")
    for product in productCardTitle:
        title = product.findChildren('a')[0].string.strip()
        print(title)
    productPriceValue = productCard.find_all('span', class_="product-price__value")
    for product in productPriceValue:
        price = product.string.strip()
        print(price)
I would appreciate it if someone could help me solve this problem.
CodePudding user response:
Maybe you want to collect the parsed data:
data = []
productCards = soup.find_all('li', class_="products-cards__item product-card")
for productCard in productCards:
    productCardTitle = productCard.find_all('h3', class_="product-card__title")
    for product in productCardTitle:
        title = product.findChildren('a')[0].string.strip()
    productPriceValue = productCard.find_all('span', class_="product-price__value")
    for product in productPriceValue:
        price = product.string.strip()
    # One entry per card, with the price converted to a number
    data.append({'title': title, 'price': float(price)})
Output:
>>> data
[{'title': 'Electric outlet 220V 32A IP44', 'price': 34.6}]
CodePudding user response:
I'm not sure what is going wrong that makes your results empty. I tried it with selenium as well as with requests, and both return valid information.
Anyway, you do not need that many loops to grab the information. Let's take a look at how to make it leaner.
Select all cards like this:
soup.select('li.product-card')
Iterate once over the result set and create a list of dicts:
[
    {
        'title': card.h3.get_text(strip=True),
        'price': card.select_one('span.product-price__value').get_text()
    }
    for card in soup.select('li.product-card')
]
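To make the comprehension above easy to try without a browser, here is a minimal sketch that runs it against a trimmed copy of the markup from the question. The second card (without a price) and the helper names text_or_none and extract_cards are hypothetical additions for illustration; they show one way to avoid an AttributeError when select_one finds nothing and returns None.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question; elided parts are left out.
# The second <li> is a hypothetical card without a price.
html = """
<ul class="products-cards products-cards--view-grid grid">
  <li class="products-cards__item product-card">
    <div class="product-card__title-wrapper">
      <h3 class="product-card__title">
        <a href="/catalog/Electric-outlet-220V-32A-IP44">
          Electric outlet 220V 32A IP44
        </a>
      </h3>
    </div>
    <div class="product-card__bottom">
      <p class="product-card__price product-price">
        <span class="visually-hidden">Price:</span>
        <span class="product-price__value">34.60</span>
      </p>
    </div>
  </li>
  <li class="products-cards__item product-card">
    <h3 class="product-card__title"><a href="#">No price here</a></h3>
  </li>
</ul>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None if the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


def extract_cards(markup):
    """Build one dict per product card found in the markup."""
    soup = BeautifulSoup(markup, "html.parser")
    return [
        {
            "title": text_or_none(card.select_one("h3.product-card__title")),
            "price": text_or_none(card.select_one("span.product-price__value")),
        }
        for card in soup.select("li.product-card")
    ]


cards = extract_cards(html)
print(cards)
```

If this prints an empty list for your real page, the cards are probably not in the HTML that was handed to BeautifulSoup, so it is worth checking the page source before looking at the selectors.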
Example (selenium)
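from bs4 import BeautifulSoup
from selenium import webdriver

# The argument is the path to the chromedriver executable (Selenium 3 style;
# Selenium 4 expects the path to be passed through a Service object instead).
driver = webdriver.Chrome('YOUR PATH TO CHROME')
driver.get('https://shop-aventa.ru/search?q= Разъем 220 ')

# Parse the HTML rendered by the browser
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()

data = [
    {
        'title': card.h3.get_text(strip=True),
        'price': card.select_one('span.product-price__value').get_text()
    }
    for card in soup.select('li.product-card')
]
print(data)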
Example (requests)
import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}

# Pass the headers so the request is sent with the browser-like User-Agent
r = requests.get('https://shop-aventa.ru/search?q= Разъем 220 ', headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')

data = [
    {
        'title': card.h3.get_text(strip=True),
        'price': card.select_one('span.product-price__value').get_text()
    }
    for card in soup.select('li.product-card')
]
print(data)
Output
[{'title': 'Разъем 220 ЭКФ (Розетка 233), 63А, IP67', 'price': '934.60'}, {'title': 'Разъем 220 ЭКФ (Розетка 223), 32А, IP44', 'price': '286.00'}, {'title': 'Разъем 220 ЭКФ (Вилка 023), 32А, IP44', 'price': '239.40'}, {'title': 'Разъем 220 ЭКФ (Розетка 123), 32А, IP44', 'price': '342.80'}, {'title': 'Разъем 220 ЭКФ (Розетка 213), 16А, IP44', 'price': '200.70'}, {'title': 'Разъем 220 ЭКФ (Розетка 113), 16А, IP44', 'price': '269.70'}, {'title': 'Разъем 220 ЭКФ (Вилка 013), 16А, IP44', 'price': '161.50'}, {'title': 'Разъем 220 ЭКФ (Розетка 413), 16А, IP44', 'price': '269.20'}, {'title': 'Разъем 220В DKC 32А 220В, 2P E, наст. IP44', 'price': '595.50'}, {'title': 'Разъем 220 ИЭК (Вилка 513), 16А, IP44, MAGNUM', 'price': '533.58'}, {'title': 'Разъем 220 ИЭК (Вилка 033), 63А, IP67, MAGNUM', 'price': '1 727.41'}, {'title': 'Разъем 220 ИЭК (Розетка 223), 32А, IP 44', 'price': '348.21'}, {'title': 'Разъем 220 ИЭК (Розетка 113), 16А, IP44, MAGNUM', 'price': '427.15'}, {'title': 'Разъем 220 ИЭК (Розетка 133), 63А, IP67, MAGNUM', 'price': '2 607.28'}, {'title': 'Разъем 220 ИЭК (Розетка 233), 63А, IP44', 'price': '2 051.63'}, {'title': 'Разъем 220 ИЭК (Вилка 023), 32А, IP44, MAGNUM', 'price': '362.05'}, {'title': 'Разъем 220 ИЭК (Розетка 113), 16А, IP 44', 'price': '330.47'}, {'title': 'Разъем 220 ИЭК (Вилка 023), 32А, IP 44', 'price': '291.02'}, {'title': 'Разъем 220 ИЭК (Розетка 213), 16А, IP44, MAGNUM', 'price': '341.92'}, {'title': 'Разъем 220 ИЭК (Вилка 013), 16А, IP 44', 'price': '200.54'}, {'title': 'Разъем 220 ИЭК (Вилка 033), 63А, IP54', 'price': '1 460.27'}, {'title': 'Разъем 220 ИЭК (Розетка скрытая 413), 16А, IP44', 'price': '350.79'}, {'title': 'Разъем 220 ИЭК (Розетка 133), 63А, IP54', 'price': '2 069.92'}, {'title': 'Разъем 220 ИЭК (Розетка 423), 32А, IP44, MAGNUM', 'price': '567.41'}]
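Note that the prices in this output are strings, and some of them (for example '1 727.41') contain a space as a thousands separator, so calling float() on them directly would raise a ValueError. The following is a minimal sketch of a conversion helper; parse_price is a hypothetical name, and it assumes that '.' is always the decimal separator, as it is in the output above.

```python
def parse_price(text):
    """Convert a price string such as '1 727.41' to a float.

    All whitespace is removed first (str.split() without arguments also
    splits on non-breaking spaces). Assumes '.' is the decimal separator.
    """
    cleaned = "".join(text.split())
    return float(cleaned)


# Sample values taken from the output above, plus one with a non-breaking space
samples = ['934.60', '1 727.41', '2\xa0607.28']
prices = [parse_price(s) for s in samples]
print(prices)
```

It could be applied inside the comprehension, e.g. 'price': parse_price(card.select_one('span.product-price__value').get_text()).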