How to get tag contents (including all text and elements)

Time:10-02

I have an HTML snippet (with no other parent elements):

html = '<div id="mydiv"><p>Hello</p><p>Goodbye</p>[...]</div>'

How do I extract all the tags and text (which may vary) within the div, but not the div tag itself? I.e.:

target_str = '<p>Hello</p><p>Goodbye</p>[...]'

I have tried:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
mydiv = soup.find(id='mydiv')
print(mydiv)
>>> '<div id="mydiv"><p>Hello</p><p>Goodbye</p>[...]</div>'

mydiv.unwrap()
print(mydiv)
>>> '<div id="mydiv"></div>'

How do I get just the contents of the tag?

Answer:

Try:

from bs4 import BeautifulSoup

html = '<div id="mydiv"><p>Hello</p><p>Goodbye</p>[...]</div>'
soup = BeautifulSoup(html, "html.parser")

print("".join(map(str, soup.select_one("#mydiv").contents)))

Prints:

<p>Hello</p><p>Goodbye</p>[...]
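
As an alternative sketch (not part of the answer above), bs4's Tag.decode_contents() should give the same inner markup as a string without the join. The snippet also shows why the unwrap() attempt in the question printed an empty div: unwrap() moves the children up into the parent and returns the now-empty tag, so the contents end up in soup rather than in mydiv. The variable names inner and removed are just illustrative.

```python
from bs4 import BeautifulSoup

html = '<div id="mydiv"><p>Hello</p><p>Goodbye</p>[...]</div>'
soup = BeautifulSoup(html, "html.parser")
mydiv = soup.find(id="mydiv")

# Render only the children of the div (tags and text) as one string.
inner = mydiv.decode_contents()
print(inner)

# unwrap() replaces the div with its children in the tree and
# returns the emptied div, which is what the question printed.
removed = mydiv.unwrap()
print(removed)
print(soup)
```

If you only need the string, decode_contents() leaves the parsed tree untouched, whereas unwrap() modifies it.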