I am trying to get the value '5' from the following:
<div >₹<!-- -->5</div>
and the Python BeautifulSoup code I used prints None:
drugprice=soup.find('div', class_="DrugPriceBox__price___dj2lv")
print(drugprice)
The webpage url is: https://www.1mg.com/drugs/acticort-5mg-tablet-321932
Thank you in advance!
Additional information:
The WORKING CODE after solving the problem:
import requests
from bs4 import BeautifulSoup

# Assumption: `headers` was not shown in the original post; a browser-like User-Agent is a common choice.
headers = {'User-Agent': 'Mozilla/5.0'}

if __name__ == '__main__':
    #turl='https://www.1mg.com/drugs/acticort-5mg-tablet-321932'
    turl = 'https://www.1mg.com/drugs/zerodol-sp-tablet-67307'
    print(turl)
    soup = BeautifulSoup(requests.get(turl, headers=headers).content, "html.parser")
    # type 'div' pricing format
    div = soup.find('div', class_='DrugPriceBox__price___dj2lv')
    if div:
        print(div.text)
    else:
        # type 'span' pricing format
        #span = soup.find('span', class_="PriceBoxPlanOption__offer-price___3v9x8 PriceBoxPlanOption__offer-price-cp___2QPU_")
        span = soup.find('span', class_="PriceBoxPlanOption__margin-right-4___2aqFt PriceBoxPlanOption__stike___pDQVN")
        if span:
            print(span.text)
        else:
            print('Nada')
CodePudding user response:
Not sure why it is not working for you. I cannot open the original site (it is blocked for me for some reason), but I served exactly the div you posted from a local Express server, and the code below worked fine for me.
import string
import bs4
import requests

if __name__ == '__main__':
    r = requests.get('http://localhost:3000/')
    soup = bs4.BeautifulSoup(r.text, 'html.parser')
    div = soup.find('div', class_='DrugPriceBox__price___dj2lv')
    # Keep only letters, digits and the decimal point (this drops the ₹ symbol).
    acceptable_chars = set(string.ascii_letters + string.digits + '.')
    drugprice = ''.join(char for char in div.text if char in acceptable_chars)
    print(drugprice)
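If you want the price as a number instead of a string, the cleanup step can be pulled out into a small helper and tried without any network access. This is only a minimal sketch: `extract_price` is a hypothetical name (it is not part of BeautifulSoup), and it swaps the character filter above for a regular expression that picks the first number in the text, assuming prices look like `₹5` or `₹1,250.50`.

```python
import re


def extract_price(text):
    """Return the first number found in text as a float, or None if there is none."""
    # Assumption: commas are thousands separators, so "1,250.50" is one number.
    cleaned = text.replace(",", "")
    # Match an integer with an optional decimal part.
    match = re.search(r"\d+(?:\.\d+)?", cleaned)
    return float(match.group()) if match else None


if __name__ == "__main__":
    print(extract_price("₹5"))             # text of the div from the question
    print(extract_price("MRP ₹1,250.50"))  # made-up example with a separator
    print(extract_price("Nada"))           # no digits at all
```

You would call it as `extract_price(div.text)` once `soup.find` has returned a tag. Returning None when there are no digits lets the caller treat "no price found" separately from a price of zero.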