I have an outer div and a img tag inside this div. Now I want the content of this div without includ

Time:08-22

The sample HTML content:

<div > Hello World It is good to see you.<span>hi<img/></span></div>

I want to print only "Hello World It is good to see you." (it should not include "hi" or the img). But when I try methods like .text in BeautifulSoup, the text from the inner tags is scraped as well. Can someone help me out?

CodePudding user response:

Consider:

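from bs4 import BeautifulSoup

my_html = '<div > Hello World It is good to see you.<span>hi<img/></span></div>'
# Name the parser explicitly; otherwise bs4 picks one itself and emits a warning
soup = BeautifulSoup(my_html, "html.parser")
div_tag = soup.find("div")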

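The following line will achieve it. recursive=False limits the search to the direct children of the div, so the text inside the span is skipped; the result is a list of strings rather than a single string:

text_content = div_tag.find_all(text=True, recursive=False)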
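
If you want one clean string instead of a list, here is a minimal sketch of the same idea. The helper name direct_text is made up for illustration and is not part of BeautifulSoup; it also assumes a bs4 version that accepts the string= keyword (the newer spelling of text=):

```python
from bs4 import BeautifulSoup


def direct_text(tag):
    """Return only the text sitting directly inside `tag`, skipping nested tags.

    Hypothetical helper for illustration; not a BeautifulSoup API.
    """
    # string=True matches text nodes; recursive=False restricts the
    # search to direct children, so the span's "hi" is never visited.
    pieces = tag.find_all(string=True, recursive=False)
    # Join the pieces and trim the leading space from the sample markup.
    return "".join(pieces).strip()


my_html = '<div > Hello World It is good to see you.<span>hi<img/></span></div>'
soup = BeautifulSoup(my_html, "html.parser")
div_tag = soup.find("div")

print(direct_text(div_tag))
```

If the div has text both before and after the nested tag, the pieces are joined in document order, so you may want to join with a space instead of an empty string.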

Hope it helps.
