Average price of scraped item on ebay using python

Time:08-28

How can I get the average price from a list of scraped item prices on eBay?

This is my code:

from urllib.request import Request, urlopen
from bs4 import BeautifulSoup
import requests
from requests_html import HTMLSession

link = "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=p2334524.m570.l1313&_nkw=5035224123933&_sacat=0&LH_TitleDesc=0&_odkw=EAN5035224123933&_osacat=0"

req = Request(link, headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
with requests.Session() as c:
    
     soup = BeautifulSoup(webpage, 'html5lib')
     lists = soup.find_all('li', class_="s-item s-item__pl-on-bottom s-item--watch-at-corner")
     for list in(lists):
        price=float(list.find('span', class_="s-item__price").text.replace('£',''))
        avg = sum(price)/len(price)

I've tried:

avg = sum(price)/len(price)

But it gives an error:

TypeError: 'float' object is not iterable

CodePudding user response:

Assuming that the rest of your code works correctly, and retrieves the prices you need, the problem is with this:

    for list in(lists):
        price=float(list.find('span', class_="s-item__price").text.replace('£',''))
        avg = sum(price)/len(price)

You assign price = float(...), so price is a single floating-point number. Calling sum() and len() on it on the next line fails because neither function works on a float. You probably wanted to collect all the prices in a list (e.g. prices) and then compute sum(prices) / len(prices).

Something like:

    prices = []
    for list in lists:
        prices.append(float(list.find('span', class_="s-item__price").text.replace('£','')))
    avg = sum(prices) / len(prices)

To understand why you got that error, consider:

>>> sum(1.0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'float' object is not iterable
>>> len(1.0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'float' has no len()

So those operations don't work on individual values: sum() needs an iterable and len() needs an object with a length (such as a list). Another clue in your original code is that the average was computed inside the loop (indented), but you want to compute it once, after the loop, not for every price.
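As a minimal sketch of that collect-then-average pattern, here it is without any scraping. The price_texts list is made up for illustration and stands in for the text of the scraped s-item__price spans; statistics.mean from the standard library is used as an alternative to sum(prices) / len(prices).

```python
from statistics import mean

# Hypothetical price strings, standing in for the scraped span texts.
price_texts = ["£10.00", "£12.50", "£7.50"]

# Collect every price in a list first...
prices = [float(text.replace("£", "")) for text in price_texts]

# ...then compute the average once, after all prices are collected.
average = mean(prices)
print(average)
```

Note that mean() raises statistics.StatisticsError on an empty list, just as sum(prices) / len(prices) raises ZeroDivisionError, so either way it is worth checking that the scrape actually returned some items.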

CodePudding user response:

You need to store the prices in a list, and then compute the average after the loop that adds each price to the list. Also, it's bad practice to use variable names that shadow built-in names like list, and in general you should prefer more descriptive variable names. In this case I suggest items for lists and item for list. Additionally, you don't need parentheses around the iterable after in.

from urllib.request import Request, urlopen
from bs4 import BeautifulSoup
import requests
from requests_html import HTMLSession

link = "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=p2334524.m570.l1313&_nkw=5035224123933&_sacat=0&LH_TitleDesc=0&_odkw=EAN5035224123933&_osacat=0"

req = Request(link, headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
with requests.Session() as c:
    soup = BeautifulSoup(webpage, 'html5lib')
    items = soup.find_all('li', class_="s-item s-item__pl-on-bottom s-item--watch-at-corner")
    prices = []
    for item in items:
        prices.append(float(item.find('span', class_="s-item__price").text.replace('£','')))
    avg = sum(prices)/len(prices)
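One caveat: float(text.replace('£','')) only works when the span text is a single plain number. As a hedged sketch, assuming some listings might show a thousands separator ("£1,200.00") or a range ("£5.00 to £15.00"), which I have not verified against the actual page, the parsing could be made more tolerant. The helper names parse_prices and average_price are hypothetical, and treating a range as its midpoint is just one possible choice.

```python
import re


def parse_prices(text):
    """Return every number found in a price string as a list of floats."""
    # Digits with optional thousands commas and an optional decimal part.
    matches = re.findall(r"\d[\d,]*(?:\.\d+)?", text)
    return [float(m.replace(",", "")) for m in matches]


def average_price(price_texts):
    """Average the listings; a price range counts as its midpoint.

    Returns None when no price could be parsed, instead of dividing by zero.
    """
    per_item = []
    for text in price_texts:
        values = parse_prices(text)
        if values:
            per_item.append(sum(values) / len(values))
    if not per_item:
        return None
    return sum(per_item) / len(per_item)


# Made-up sample strings, not real eBay data.
sample = ["£10.00", "£1,200.00", "£5.00 to £15.00"]
print(average_price(sample))
```

With the scraping code above, this would be called as average_price(item.find('span', class_="s-item__price").text for item in items), wrapped in a list if the input needs to be reused.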