firstHeader = mclarenHTML.find_all(re.compile('^h[2]'))[0] #finding header titles
print(firstHeader)
Output:
<h2><strong><strong>1950-1953: </strong>Formula 1 begins: the super-charger years</strong></h2>
How do I get the string "1950-1953:Formula 1 begins: the super-charger years"?
I tried using .string, but it returns None.
CodePudding user response:
Use .text:
from bs4 import BeautifulSoup
soup = BeautifulSoup(
"<h2><strong><strong>1950-1953: </strong>Formula 1 begins: the super-charger years</strong></h2>",
"html.parser",
)
header = soup.h2
print(header.text)
Prints:
1950-1953: Formula 1 begins: the super-charger years
Or use .get_text(), which also accepts the strip= and separator= parameters:
print(header.get_text(strip=True, separator=" "))
Prints:
1950-1953: Formula 1 begins: the super-charger years
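As for why .string returns None here: as far as I understand it, .string only follows a chain of single children, and the outer <strong> has two (the inner <strong> and a text node). The sketch below checks that on the same markup, and also shows that strip=True without a separator produces the unspaced string from the question. The variable names (outer_strong, child_count, fragments, no_space) are only for illustration.

```python
from bs4 import BeautifulSoup

html = (
    "<h2><strong><strong>1950-1953: </strong>"
    "Formula 1 begins: the super-charger years</strong></h2>"
)
header = BeautifulSoup(html, "html.parser").h2

# The outer <strong> holds two children: the inner <strong> and a text node.
outer_strong = header.strong
child_count = len(outer_strong.contents)

# .string gives up (returns None) when a tag has more than one child.
string_value = header.string

# stripped_strings yields every text fragment with surrounding whitespace removed.
fragments = list(header.stripped_strings)

# strip=True with no separator joins the stripped fragments directly.
no_space = header.get_text(strip=True)

print(child_count)
print(string_value)
print(fragments)
print(no_space)
```

If you want the space after the colon, keep separator=" " as in the answer above; if you want the text exactly as written in the question, drop the separator.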