Python web scraping returns the wrong price

Time:01-21

I want to compare the price of a coconut on two websites. The two stores (websites) are called Laughs and Glomark.

I have two files, main.py and comparison.py. I think the problem is in the Laughs price-scraping part. The code runs without error. My actual output and the expected output are below the code.

main.py

from comparison import compare_prices
laughs_coconut = 'https://scrape-sm1.github.io/site1/COCONUT market1super.html'
glomark_coconut = 'https://glomark.lk/coconut/p/11624'
compare_prices(laughs_coconut,glomark_coconut) 

comparison.py

import requests
import json
from bs4 import BeautifulSoup

#Imitate the Mozilla browser.
user_agent = {'User-agent': 'Mozilla/5.0'}

def compare_prices(laughs_coconut,glomark_coconut):
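    # Acquire the web pages which contain the product price,
    # sending the browser-like User-agent header defined above
    laughs_coconut = requests.get(laughs_coconut, headers=user_agent)
    glomark_coconut = requests.get(glomark_coconut, headers=user_agent)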

    # LaughsSuper supermarket website provides the price in a span text.
    soup_laughs = BeautifulSoup(laughs_coconut.text, 'html.parser')
    price_laughs = soup_laughs.find('span',{'class': 'price'}).text
    
    
    # Glomark supermarket website provides the data in JSON format in an inline script.
    soup_glomark = BeautifulSoup(glomark_coconut.text, 'html.parser')
    script_glomark = soup_glomark.find('script', {'type': 'application/ld+json'}).text
    data_glomark = json.loads(script_glomark)
    price_glomark = data_glomark['offers'][0]['price']

    
    #TODO: Parse the values as floats, and print them.
    price_laughs = price_laughs.replace("Rs.","")
    price_laughs = float(price_laughs)
    price_glomark = float(price_glomark)
    print('Laughs   COCONUT - Item#mr-2058 Rs.: ', price_laughs)
    print('Glomark  Coconut Rs.: ', price_glomark)
    
    # Compare the prices and print the result
    if price_laughs > price_glomark:
        print('Glomark is cheaper Rs.:', price_laughs - price_glomark)
    elif price_laughs < price_glomark:
        print('Laughs is cheaper Rs.:', price_glomark - price_laughs)    
    else:
        print('Price is the same')

My code runs without error and prints:

Laughs   COCONUT - Item#mr-2058 Rs.:  0.0

Glomark  Coconut Rs.:  110.0

Laughs is cheaper Rs.: 110.0

But the expected output is:

Laughs   COCONUT - Item#mr-2058 Rs.:  95.0

Glomark  Coconut Rs.:  110.0

Laughs is cheaper Rs.: 15.0

Note: <span class="price">Rs.95.00</span> is the element that holds the Laughs coconut price.

CodePudding user response:

This happens because two elements match 'span',{'class': 'price'}. The find() method returns only the first match, so use findAll() instead and take the second result. If you change the line in your code to price_laughs = soup_laughs.findAll('span',{'class': 'price'})[1].text, the problem will be solved.
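
To see the difference between the two calls, here is a small sketch on hand-written HTML. The markup is only an assumed stand-in for the Laughs page (two span elements with class price, the first one holding a zero price); it is not copied from the real site.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: assumes the page has two span.price elements
# and that the first one holds a placeholder price.
html = """
<div><span class="price">Rs.0.00</span></div>
<div><span class="price">Rs.95.00</span></div>
"""

soup = BeautifulSoup(html, 'html.parser')

# find() stops at the first matching element.
first = soup.find('span', {'class': 'price'}).text

# find_all() (findAll is the older alias) returns every match in document order.
prices = [span.text for span in soup.find_all('span', {'class': 'price'})]

print(first)    # Rs.0.00
print(prices)   # ['Rs.0.00', 'Rs.95.00']
```

Picking an element by position works, but it may break if the page adds or removes a price element, so treat the index as something to re-check against the live page.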

CodePudding user response:

Try changing your strategy for selecting the element. The element's container has an id, which lets you select it more specifically. You could use CSS selectors, for example:

price_laughs = soup_laughs.select_one('[id^="product-price"] .price').text
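
As a rough illustration of that selector: [id^="product-price"] matches any element whose id starts with product-price, and .price then picks a descendant with class price. The markup below, including the id value product-price-2058 and the cart-summary container, is assumed for the example and not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the id value and the surrounding elements are assumed.
html = """
<div class="cart-summary"><span class="price">Rs.0.00</span></div>
<div id="product-price-2058"><span class="price">Rs.95.00</span></div>
"""

soup_laughs = BeautifulSoup(html, 'html.parser')

# Only a span inside the container whose id starts with "product-price"
# can match, so the unrelated first span is skipped.
price_text = soup_laughs.select_one('[id^="product-price"] .price').text

# Strip the currency prefix before converting to a number.
price_laughs = float(price_text.replace('Rs.', ''))
print(price_laughs)   # 95.0
```

Unlike indexing into a list of matches, this does not depend on how many other price elements appear on the page.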

Concerning the other website, you could also use its API to get the price:

requests.get('https://glomark.lk/product-page/variation-detail/11624', headers={'x-requested-with': 'XMLHttpRequest'}).json()['price']