How to scrape and construct price that is in the same div but different sub classes?

Time:03-03

I am trying to build a price from two different tags (see picture below). How do I nest the search so that it looks inside the div for the span and sub tags?

How do I get the numbers out of the span and sub tags in the div?

I have tried the following:

coldbeetrootsoup=BeautifulSoup(f,'html.parser')

try:
    price = coldbeetrootsoup.find("span",{"class": None}).text.replace('\n',"")
except:
    price = None

try:
    subprice = coldbeetrootsoup.find("sub",{"class": None}).text.replace('\n',"")
except:
    subprice = None

link: https://www.rimi.lt/e-parduotuve/lt/produktai/vaisiai-darzoves-ir-geles/vaisiai-ir-uogos/obuoliai-/fas-liet-obuoliai-ligol-nuraude-anyks-vnt/p/923923

target: price 1.39 EUR

CodePudding user response:

This should give you what you want:

import re

import requests
from bs4 import BeautifulSoup

html = requests.get(
    "https://www.rimi.lt/e-parduotuve/lt/produktai/vaisiai-darzoves-ir-geles/vaisiai-ir-uogos/obuoliai-/fas-liet-obuoliai-ligol-nuraude-anyks-vnt/p/923923").text


soup = BeautifulSoup(html, features="html.parser")

price_div = soup.find("div", {"class": "price"})
full_part = price_div.find("span").text
cents_part = price_div.find("sup").text
currency = price_div.find("sub").text

# Strip all whitespace (newlines, indentation) from the currency text.
currency = re.sub(r"\s+", "", currency)

print(f"{full_part}.{cents_part} {currency}")  # 1.39 €/vnt.
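
To try the nested lookup without requesting the live page, here is a minimal sketch that runs against a hand-written HTML snippet. The markup is an assumption modeled on the tags named above (span, sup and sub inside the div with class "price"), not a copy of the real page, and extract_price is a hypothetical helper name. Unlike the snippet above, it returns None instead of raising when a tag is missing.

```python
import re

from bs4 import BeautifulSoup

# Assumed markup: a simplified stand-in for the product page, not the real HTML.
SAMPLE_HTML = """
<div class="price">
  <span>1</span>
  <div>
    <sup>39</sup>
    <sub>€ /vnt.</sub>
  </div>
</div>
"""


def extract_price(html):
    """Return 'euros.cents currency' from the first div.price, or None."""
    soup = BeautifulSoup(html, "html.parser")
    price_div = soup.find("div", {"class": "price"})
    if price_div is None:
        return None

    # Search only inside the price div, not the whole document.
    parts = [price_div.find(name) for name in ("span", "sup", "sub")]
    if any(part is None for part in parts):
        return None

    # Remove newlines and indentation from each piece of text.
    euros, cents, currency = (re.sub(r"\s+", "", part.text) for part in parts)
    return f"{euros}.{cents} {currency}"


print(extract_price(SAMPLE_HTML))  # 1.39 €/vnt.
```

Checking for None explicitly keeps the failure case visible, where a bare try/except would hide which tag was missing.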

CodePudding user response:

You can start by selecting the div you want to dig into, in this case the div with the class 'price'. You do this the same way you are already trying to find the spans:

price = soup.find('div', {'class' : 'price'})

Once we have this div, we can search for the wanted tags within it instead of searching the whole HTML:

euro = price.find('span')
cent = price.find('sup')

Now, to get what you want, you can do:

print(f"{euro.text}.{cent.text}")

Or, if you want the price as a float:

price_tag = round(int(euro.text) + int(cent.text) / 100, 2)

Here we divide the cents by 100 to convert them into a fraction of a euro, and we use the round function to keep only two decimal places.
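
If the price feeds into further arithmetic, binary floats can pick up small rounding errors. A hedged alternative is to build a decimal.Decimal directly from the two text parts. This is only a sketch: to_decimal_price is a hypothetical helper name, and it assumes the cents text always has two digits, as in the example above.

```python
from decimal import Decimal


def to_decimal_price(euro_text, cent_text):
    """Combine the euro and cent strings into an exact Decimal."""
    # Strip surrounding whitespace left over from the HTML.
    euros = euro_text.strip()
    cents = cent_text.strip()
    # Assumption: cents are already two digits, e.g. "39".
    return Decimal(f"{euros}.{cents}")


print(to_decimal_price("1", "39"))  # 1.39
```

With the tags found above, this would be called as to_decimal_price(euro.text, cent.text).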
