How would one go about appending all values attached to a variable to one list?

Time:07-10

I have some code that scrapes prices and other data from major reselling websites using bs4 and then stores it as JSON. I want to collect all the prices in one list so I can compute the average retail price.

Unfortunately, everything I've tried only seems to create a different list for each price:

try:
  price = item.select_one('.s-item__price').text
except:
  price = None

value = Decimal(sub(r'[^\d.]', '', price))
a = str(value)
b = list(a.split())

Outputting b results in:

['20.00']
['199.95']
['48.99']
['100.00']
['119.00']
['19.99']
['35.00']
['85.00']
['39.00']
['27.66']
['75.00']

As shown, this produces a separate single-element list for each price, so the values cannot be summed. Printing price gives a similar result without the brackets. I used sub to strip the currency symbol and Decimal to get a decimal value, then converted that to a string because I got an error saying floats are not iterable. Using itertools did not help either.

How would one go about getting a format like the below?

prices = [20.00, 199.95, ..., 75.00]

Apologies if this is an obvious question; I am new to this side of Python.

CodePudding user response:

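This should work: create the list once, before the loop, and append each price to it inside the loop. With a plain for loop it looks like this:

from decimal import Decimal
from re import sub

prices = []  # created once, outside the loop

for item in items:  # items: whatever collection of products you already loop over
    try:
        price = item.select_one('.s-item__price').text
    except AttributeError:  # select_one returned None: no price element
        continue
    value = Decimal(sub(r'[^\d.]', '', price))
    prices.append(float(value))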

Or, equivalently, as a list comprehension:

from re import sub

# items: the same collection of products as in your existing loop
tags = (item.select_one('.s-item__price') for item in items)
prices = [float(sub(r'[^\d.]', '', tag.text)) for tag in tags if tag is not None]

CodePudding user response:

Assuming you iterate over a ResultSet of products, simply look up the price of each one and append it to a list:

data = []
for item in soup.select('.s-item__wrapper'):
        data.append(
            re.sub(r'[^\d.]', '', item.find('span', {'class': 's-item__price'}).text) if item.find('span', {'class': 's-item__price'}) else None
        )

Example

import requests, re
from bs4 import BeautifulSoup

page = requests.get('https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=p2334524.m570.l1311&_nkw=python programming&_sacat=0&LH_TitleDesc=0&_odkw=python&_osacat=0')
soup = BeautifulSoup(page.text, 'html.parser')

data = []
for item in soup.select('.s-item__wrapper'):
        data.append(
            re.sub(r'[^\d.]', '', item.find('span', {'class': 's-item__price'}).text) if item.find('span', {'class': 's-item__price'}) else None
        )
print(data)

Output

['20.00', '3.31', '5.22', '3.19', '4.52', '9.99', '5.71', '7.69', '3.54', '9.99', '3.30', '8.95', '5.63', '19.99', '33.53', '62.04', '32.16', '43.42', '5.00', '8.00', '3.74', '10.00', '7.40', '42.40', '25.03', '9.03', '11.22', '29.51', '11.86', '30.45', '33.80', '22.99', '44.94', '29.74', '7.68', '60.98', '23.81', '9.83', '15.28', '70.61', '28.67', '100.00', '16.75', '14.92', '13.33', '13.54', '36.66', '20.16', '6.42', '16.85', '20.00', '4.82', '18.99', '31.34', '19.30', '100.00', '29.66', '54.52', '10.64', '3.82', '100.00', '15.20', '27.18', '14.17', '20.00', '29.99', '27.00', '38.51', '18.98', '100.00', '106.97', '36.81']
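
Note that the list above still holds strings (and possibly None), so it cannot be averaged directly. A minimal sketch of the remaining step, assuming you want Decimal arithmetic and want to skip entries that are missing or do not parse; the function name average_price and the sample values are made up for illustration:

```python
from decimal import Decimal, InvalidOperation

def average_price(raw_prices):
    """Average a list of price strings, skipping None and unparseable entries."""
    parsed = []
    for raw in raw_prices:
        if raw is None:
            continue  # the item had no price element
        try:
            parsed.append(Decimal(raw))
        except InvalidOperation:
            continue  # e.g. an empty string or two prices fused together
    if not parsed:
        return None  # nothing to average
    return sum(parsed) / len(parsed)

# Sample values in the same shape as the scraped list (illustrative only)
sample = ['20.00', '3.31', None, '5.22', '']
print(average_price(sample))
```

Staying with Decimal avoids float rounding on currency values; call float() on the result if you only need an approximate number.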

In addition, if you want to scrape several pieces of information, I recommend avoiding a bunch of separate lists and working with a single list of dicts instead:

data = []
for item in soup.select('.s-item__wrapper')[1:]:
        data.append({
            'name':item.h3.text,
            'price':re.sub(r'[^\d.]', '', item.find('span', {'class': 's-item__price'}).text) if item.find('span', {'class': 's-item__price'}) else None
        })
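
With the data in this shape, the average can be computed by pulling the 'price' value out of each dict. A small sketch under the assumption that missing prices are stored as None, as in the loop above; the records here are hypothetical stand-ins for scraped results:

```python
from decimal import Decimal
from statistics import mean

# Hypothetical records in the shape produced by the loop above
records = [
    {'name': 'Book A', 'price': '20.00'},
    {'name': 'Book B', 'price': None},   # no price element was found
    {'name': 'Book C', 'price': '10.00'},
]

# Keep only records that have a price, converting each to Decimal
prices = [Decimal(r['price']) for r in records if r['price']]

# mean() raises StatisticsError on an empty list, so guard against that
average = mean(prices) if prices else None
print(average)
```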