I need to extract the text from an H1 span (no class) with BeautifulSoup and remove the brackets

Time:10-01

The H1 is unique on the page:

<h1>Anno <span>(2021)</span></h1>

I need to extract the text from the span inside the H1 (the span has no class) with BeautifulSoup and remove the brackets.

CodePudding user response:

Here is the working solution:

from bs4 import BeautifulSoup

tag="""
<h1>    
 Anno   
 <span> 
  (2021)
 </span>
</h1>

"""

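soup = BeautifulSoup(tag, 'html.parser')
# get_text(strip=True) drops the surrounding whitespace and newlines;
# strip('()') then removes the leading and trailing brackets
span = soup.select_one('h1 span').get_text(strip=True).strip('()')
print(span)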

Output

2021
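
If the page might not contain the span at all, select_one returns None and calling a method on it raises an AttributeError. A minimal sketch of a guarded version follows; the function name span_text_without_brackets is hypothetical and not part of BeautifulSoup, and it assumes the brackets are plain round parentheses:

```python
import re
from bs4 import BeautifulSoup


def span_text_without_brackets(html):
    """Return the text of the first <span> inside an <h1> with round
    brackets removed, or None if no such span exists.
    (Hypothetical helper, for illustration only.)"""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.select_one("h1 span")
    if span is None:
        # No <h1> or no <span> inside it
        return None
    # Strip surrounding whitespace, then drop every "(" and ")"
    return re.sub(r"[()]", "", span.get_text(strip=True))


print(span_text_without_brackets("<h1>Anno <span>(2021)</span></h1>"))
```

For the H1 from the question this prints 2021; for markup without the span it returns None instead of raising.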