I want to extract information from a website, but I cannot reach the part I need because of the way the HTML is formatted. In the HTML below, I would like to extract the text "mtl.". You can see that after class=, the value '3D" is closed before the whole class name is finished. I have tried every variation I could think of to select the "mtl." element, but none of them worked.
<div class='3D"ServiceOffer_badge__kriSF"'>
<div>
<p ce__offering___1cjqq="" class='3D"Pri=' inline;"="" price__brand___2kedu="" price__large="___35JMV" price__price___38oh2="" price__price___38oh2"="" style='3D"display:'>
<span class='3D"Pr=' ice__value___wawnq"="">
0 =E2=82=AC
</span>
<span class='3D"Price__suffix___1=' d8-m"="">
mtl.
</span>
Do you have any idea how to do this? Thank you so much in advance!
CodePudding user response:
from bs4 import BeautifulSoup
html = '''<div class='3D"ServiceOffer_badge__kriSF"'>
<div>
<p ce__offering___1cjqq="" class='3D"Pri=' inline;"="" price__brand___2kedu="" price__large="___35JMV" price__price___38oh2="" price__price___38oh2"="" style='3D"display:'>
<span class='3D"Pr=' ice__value___wawnq"="">
0 =E2=82=AC
</span>
<span class='3D"Price__suffix___1=' d8-m"="">
mtl.
</span>'''
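soup = BeautifulSoup(html, 'html.parser')
# The parser sees only the single-quoted fragment as the class value,
# so match on that literal string instead of the intended class name.
spanStr = soup.find('span', {'class':'3D"Price__suffix___1='}).text.strip()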
Output:
print(spanStr)
mtl.
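A possible alternative, offered as a sketch rather than a definitive fix: the `=3D`, the `=E2=82=AC` and the `=` that cuts the class names in half look like quoted-printable encoding, which is what you typically get when a page is saved as .mhtml or taken from an e-mail. If that is the case here, decoding the raw file first should restore the real class names. The `raw` bytes below are an assumed reconstruction of what two of the spans might look like in the saved file, and `SpanText` is a hypothetical helper name; only the standard library is used.

```python
import quopri
from html.parser import HTMLParser

# ASSUMPTION: a guess at the raw, still-encoded source of two spans.
# A trailing "=" is a quoted-printable soft line break, "=3D" is "=".
raw = (
    b'<span class=3D"Pr=\n'
    b'ice__value___wawnq">0 =E2=82=AC</span>\n'
    b'<span class=3D"Price__suffix___1=\n'
    b'd8-m">mtl.</span>\n'
)

# Undo the quoted-printable encoding, then decode the bytes as UTF-8.
decoded = quopri.decodestring(raw).decode("utf-8")


class SpanText(HTMLParser):
    """Hypothetical helper: collect text of <span> tags whose class starts with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.capture = False
        self.found = []

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class") or ""
        if tag == "span" and cls.startswith(self.prefix):
            self.capture = True
            self.found.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self.capture = False

    def handle_data(self, data):
        if self.capture:
            self.found[-1] += data


parser = SpanText("Price__suffix")
parser.feed(decoded)
suffix = parser.found[0].strip()
print(suffix)
```

Once the text is decoded, the same string could also be passed to BeautifulSoup(decoded, 'html.parser') and searched by its normal class name, which avoids matching on the broken '3D"...' fragment.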