How can I extract an int?

Time:09-21

I'm new to Stack Overflow. I'm writing a Python script and have a problem I can't resolve: I need to create a variable holding the price of a product. So far I've collected the decimal price in € through web scraping.

import bs4, requests
 
link = "https://hookpod.shop/products/hookpod-screw-adapter"
 
response = requests.get(link)
response.raise_for_status()
soup = bs4.BeatifulSoup(response.text, 'html.parster')
span_price = soup.find('span', class_='product__price')

The output I get is:

<span class="product__price" data-product-price=""> €10.00 </span>

I need to extract the amount (€10.00) and convert it to an int. Can anybody help me with this? I really need it.

CodePudding user response:

Converting the text of span_price to an int will solve it. Note that int() cannot parse '10.00' directly, so go through float first.

Something like:

int_span_price = int(float(span_price.text.replace('€', '')))

CodePudding user response:

The find method returns a Tag object, and you can access its string via the text attribute. Then remove the whitespace around it with strip, and drop the currency symbol, for example with a slice. Finally, cast to float and then to int.

from bs4 import BeautifulSoup

html = '<span class="product__price" data-product-price=""> €10.00 </span>'

span_price = BeautifulSoup(html,'lxml') # you can change parser

span_price_value = int(float(span_price.text.strip()[1:]))

print(span_price_value)

Remarks:

  1. I used another parser (lxml), but it makes no difference. Just be sure to change it if you don't have lxml installed.
  2. If you don't use strip, be careful with the slice: because of the leading space, the start index is no longer 1.
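The steps described in this answer can be sketched one at a time as below. The helper name price_text_to_int is hypothetical, and the sketch assumes the price starts with a single-character currency symbol and uses a dot as the decimal separator.

```python
def price_text_to_int(text: str) -> int:
    """Convert a price string such as ' €10.00 ' to a whole number.

    Assumes a one-character currency symbol in front and '.' as the
    decimal separator (hypothetical helper, for illustration only).
    """
    cleaned = text.strip()          # remove surrounding whitespace
    number_part = cleaned[1:]       # drop the leading currency symbol
    return int(float(number_part))  # int() truncates any fractional part


print(price_text_to_int(" €10.00 "))
```

Because int truncates rather than rounds, a price like €10.99 would also come out as 10 with this approach.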

CodePudding user response:

I recommend using price-parser: https://pypi.org/project/price-parser/

To install it, run pip install price-parser

>>> from price_parser import Price
>>> price = Price.fromstring("22,90 €")
>>> price
Price(amount=Decimal('22.90'), currency='€')
>>> price.amount       # numeric price amount
Decimal('22.90')
>>> price.currency     # currency symbol, as appears in the string
'€'
>>> price.amount_text  # price amount, as appears in the string
'22,90'
>>> price.amount_float # price amount as float, not Decimal
22.9

CodePudding user response:

Use Beautiful Soup's tag system to lock on to that data and getText() to pull it out. You could also parse the result of the soup.find call you already have.
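A minimal sketch of that idea, using the span from the question as a literal string instead of fetching the page. It uses get_text, the snake_case name of the same method, with strip=True, and it assumes the euro sign is the only non-numeric prefix.

```python
from bs4 import BeautifulSoup

# The HTML from the question, hard-coded so the sketch runs offline.
html = '<span class="product__price" data-product-price=""> €10.00 </span>'
soup = BeautifulSoup(html, "html.parser")

# Lock on to the tag, then pull out its text without surrounding whitespace.
span_price = soup.find("span", class_="product__price")
price_text = span_price.get_text(strip=True)

# Assumption: the only prefix to remove is the euro sign.
price_int = int(float(price_text.lstrip("€")))
print(price_int)
```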

CodePudding user response:

There were a couple of typos, so I am writing out the full code. Use a regex to get the digits out of the euro price you already have.

import bs4, requests

link = "https://hookpod.shop/products/hookpod-screw-adapter"

response = requests.get(link)
response.raise_for_status()
soup = bs4.BeautifulSoup(response.text, 'html.parser')
span_price = soup.find('span', class_='product__price')

import re

# \d+ matches the first run of digits, i.e. the "10" in "€10.00"
result = re.search(r'\d+', span_price.text)
result_int = int(result.group())
print(result_int)
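If the cents matter, a variation on the regex approach can capture the decimal part as well. This is only a sketch: the pattern assumes a dot as the decimal separator and no thousands separators, and the variable names are illustrative.

```python
import re
from decimal import Decimal

text = " €10.00 "  # the text of the span from the question

# Digits, optionally followed by a dot and more digits.
match = re.search(r"\d+(?:\.\d+)?", text)
if match is None:
    raise ValueError(f"no price found in {text!r}")

amount = Decimal(match.group())  # exact decimal value, no float rounding
whole_euros = int(amount)        # whole euros, fraction truncated
cents = int(amount * 100)        # the price as an integer number of cents

print(whole_euros, cents)
```

Storing the price as an integer number of cents keeps the value an int without discarding the fractional part.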