How do I print the string of an <h> tag that has multiple <strong> tags?

firstHeader = mclarenHTML.find_all(re.compile('^h[2]'))[0]  # find the first header title
print(firstHeader)

Output

<h2><strong><strong>1950-1953: </strong>Formula 1 begins: the super-charger years</strong></h2>

How do I get the string "1950-1953:Formula 1 begins: the super-charger years"?

I tried using .string, but it returns None.

CodePudding user response:

Use .text:

from bs4 import BeautifulSoup

soup = BeautifulSoup(
    "<h2><strong><strong>1950-1953: </strong>Formula 1 begins: the super-charger years</strong></h2>",
    "html.parser",
)

header = soup.h2

print(header.text)

Prints:

1950-1953: Formula 1 begins: the super-charger years
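As for why .string returned None: as far as I understand it, .string only gives a value when a tag has a single child (it then follows that single child down). Here the outer <strong> has two children, the inner <strong> and a text node, so there is no single string to return. A minimal sketch of the difference, reusing the same markup (the variable names are just for illustration):

```python
from bs4 import BeautifulSoup

html = "<h2><strong><strong>1950-1953: </strong>Formula 1 begins: the super-charger years</strong></h2>"
soup = BeautifulSoup(html, "html.parser")
header = soup.h2

# The outer <strong> holds two children (inner <strong> + text),
# so .string cannot pick a single string and gives None.
string_value = header.string

# The inner <strong> holds exactly one text node, so .string works there.
inner_value = header.strong.strong.string

# .text concatenates every text node under the tag.
text_value = header.text

print(string_value)
print(repr(inner_value))
print(text_value)
```

So .string is only a good fit for tags that wrap one piece of text; for nested markup like this header, .text or .get_text() is the safer choice.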

Or use .get_text(), which also accepts the strip= and separator= parameters:

print(header.get_text(strip=True, separator=" "))

Prints:

1950-1953: Formula 1 begins: the super-charger years
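If you want the text of every matching header rather than only the first one, the same call can go in a loop over find_all(). This is only a sketch: sample_html and its second header are made up to stand in for your real mclarenHTML page.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; the second header is invented.
sample_html = """
<h2><strong><strong>1950-1953: </strong>Formula 1 begins: the super-charger years</strong></h2>
<p>Some body text.</p>
<h2><strong>A second, made-up header</strong></h2>
"""
mclarenHTML = BeautifulSoup(sample_html, "html.parser")

# Same pattern as in the question; get_text() flattens the nested <strong> tags.
titles = [
    header.get_text(strip=True, separator=" ")
    for header in mclarenHTML.find_all(re.compile("^h[2]"))
]

for title in titles:
    print(title)
```

If you only ever need <h2> tags, find_all("h2") should do the same job without the regular expression.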