My 'price' variable of my parser is stuck on the first price it parsed

Time:11-02

I coded a parser, this one:

import csv

from bs4 import BeautifulSoup
import requests

source = requests.get('https://www.nike.com/fr/w/hommes-chaussures-nik1zy7ok').text
soup = BeautifulSoup(source, 'lxml')

csv_file = open('nikeshoes.csv', 'w', newline='')
csv_writer = csv.writer(csv_file)
csv_writer.writerow(['name', 'price', 'image_scr'])

for pair in soup.find_all('div', class_='product-card__body'):

    name = pair.a.text
    print(name)

    price = soup.find('div', class_='product-price css-11s12ax is--current-price').text
    print(price)

    try:
        image_scr = pair.select_one('img.css-1fxh5tw.product-card__hero-image')['src']
    except Exception as e:
        image_scr = None
    print(image_scr)

    print()
    csv_writer.writerow([name, price, image_scr])

csv_file.close()

The names and image links work fine and I get all of them, but the price variable seems to be stuck on the first price on the web page.

It is weird because when I look through the website, the price changes in the HTML path I use. Does anyone know why the price is stuck on the first item?

CodePudding user response:

You should try this:

for x in soup.find_all('div', class_='product-price css-11s12ax is--current-price'):
    print(x.text)
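
To see why the price in the question never changes: inside the loop, soup.find searches the whole document, so it returns the first matching element on every iteration, whereas pair.find only searches inside the current card. A minimal sketch, assuming a made-up two-card page (the HTML and the shortened class name are illustrative, not Nike's real markup):

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the real page: two cards, two prices.
html = """
<div class="product-card__body"><a>Shoe A</a><div class="product-price">100 €</div></div>
<div class="product-card__body"><a>Shoe B</a><div class="product-price">150 €</div></div>
"""
soup = BeautifulSoup(html, "html.parser")

global_prices = []
scoped_prices = []
for pair in soup.find_all("div", class_="product-card__body"):
    # soup.find starts from the document root, so it always hits the first price.
    global_prices.append(soup.find("div", class_="product-price").text)
    # pair.find starts from the current card, so it sees that card's price.
    scoped_prices.append(pair.find("div", class_="product-price").text)

print(global_prices)  # the first price, repeated
print(scoped_prices)  # one price per card
```

On the real page the scoped lookup can still return None for cards whose price element uses different classes, so it is worth checking the result before reading .text.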

CodePudding user response:

What happens?

At this point and until the end of the list the price cannot be found, although it is the same path.

Not really: there is a sale price and a regular price, and the structure of the elements and classes is slightly different, so your selection won't work.

How to fix?

Select your elements more specifically, for example by attribute:

regular_price = pair.select_one('div[data-test="product-price"]').text.replace(u'\xa0', u' ') if pair.select('div[data-test="product-price"]') else None

You should also check whether the element is available or not.
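
One way to keep that availability check in a single place is a small helper. This is only a sketch: text_or_none is a hypothetical name, and the sample HTML is made up to mimic the data-test attributes used above.

```python
from bs4 import BeautifulSoup


def text_or_none(parent, selector):
    """Return the cleaned text of the first match for selector, or None."""
    element = parent.select_one(selector)
    if element is None:
        # Nothing matched inside this parent, e.g. the item is not on sale.
        return None
    # Replace non-breaking spaces so the price prints and compares cleanly.
    return element.get_text(strip=True).replace("\xa0", " ")


# Made-up card with a regular price but no reduced price.
html = """
<div class="product-card__body">
  <div data-test="product-price">99,99\xa0€</div>
</div>
"""
card = BeautifulSoup(html, "html.parser")

regular_price = text_or_none(card, 'div[data-test="product-price"]')
sale_price = text_or_none(card, 'div[data-test="product-price-reduced"]')
print(regular_price, sale_price)
```

Compared with the inline conditional expression, the helper runs the selector once per lookup instead of twice.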

Example

from bs4 import BeautifulSoup
import requests

source = requests.get('https://www.nike.com/fr/w/hommes-chaussures-nik1zy7ok').text
soup = BeautifulSoup(source, 'lxml')

data = []
for pair in soup.find_all('div', class_='product-card__body'):

    name = pair.a.text.replace(u'\xa0', u' ')
    regular_price = pair.select_one('div[data-test="product-price"]').text.replace(u'\xa0', u' ') if pair.select('div[data-test="product-price"]') else None
    sale_price = pair.select_one('div[data-test="product-price-reduced"]').text.replace(u'\xa0', u' ') if pair.select('div[data-test="product-price-reduced"]') else None

    try:
        image_src = pair.select_one('img.css-1fxh5tw.product-card__hero-image')['src']
    except Exception as e:
        image_src = None
    
    data.append({
        'name':name,
        'regular_price':regular_price,
        'sale_price':sale_price,
        'image_src':image_src
    })

print(data)
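
Since the question writes its results to a CSV file, the list of dicts can be passed to csv.DictWriter from the standard library. A minimal sketch, assuming rows shaped like the data list above (the sample values and the write_rows helper name are made up for illustration):

```python
import csv
import io

# Sample rows in the same shape as the `data` list (values are made up).
data = [
    {"name": "Shoe A", "regular_price": "100 €", "sale_price": None,
     "image_src": "https://example.com/a.png"},
    {"name": "Shoe B", "regular_price": "150 €", "sale_price": "120 €",
     "image_src": None},
]


def write_rows(rows, csv_file):
    """Write the scraped rows, with a header line, to an open text file."""
    fieldnames = ["name", "regular_price", "sale_price", "image_src"]
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()
    # None values are written as empty fields.
    writer.writerows(rows)


# An in-memory buffer keeps the demo free of side effects.
output = io.StringIO()
write_rows(data, output)
print(output.getvalue())
```

To write a real file, pass a handle opened with open('nikeshoes.csv', 'w', newline='', encoding='utf-8') instead of the in-memory buffer.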