The page has only one H1:
<h1>Anno <span>(2021)</span></h1>
I need to extract the text from the span inside the H1 (the span has no class) with BeautifulSoup and remove the brackets.
CodePudding user response:
Here is the working solution:
from bs4 import BeautifulSoup
tag="""
<h1>
Anno
<span>
(2021)
</span>
</h1>
"""
soup = BeautifulSoup(tag, 'html.parser')
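# get_text(strip=True) trims the newlines around the span text before the brackets are removed
span = soup.select_one('h1 span').get_text(strip=True).replace('(', '').replace(')', '')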
print(span)
Output
2021
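
If the span might be missing on some pages, the same idea can be wrapped in a small helper. This is only a sketch: the function name extract_year is made up for illustration and is not part of BeautifulSoup, and it assumes the brackets only appear at the start and end of the span text.

```python
from bs4 import BeautifulSoup


def extract_year(html):
    """Return the text of the first <span> inside an <h1>, without the
    surrounding brackets, or None if there is no such span.

    extract_year is a hypothetical helper name used for illustration.
    """
    soup = BeautifulSoup(html, "html.parser")
    span = soup.select_one("h1 span")
    if span is None:
        # select_one returns None when nothing matches the selector
        return None
    # strip=True trims whitespace and newlines around the text
    text = span.get_text(strip=True)
    # str.strip("()") removes only leading and trailing brackets
    return text.strip("()")


print(extract_year("<h1>Anno <span>(2021)</span></h1>"))
```

Unlike chained replace() calls, strip("()") leaves any brackets in the middle of the text alone, which may or may not be what you want.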