Sample HTML content:
<div > Hello World It is good to see you.<span>hi<img/></span></div>
I want to print only "Hello World It is good to see you." (without "hi" or the img). But when I use something like .text in BeautifulSoup, it also picks up the text from the inner tags. Can someone help me out?
CodePudding user response:
Consider:
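from bs4 import BeautifulSoup
my_html = '<div > Hello World It is good to see you.<span>hi<img/></span></div>'
soup = BeautifulSoup(my_html, "html.parser")  # name the parser explicitly to avoid the "no parser specified" warning
div_tag = soup.find("div")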
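The following line returns a list of the text nodes that are direct children of the div, skipping any text inside nested tags such as the span: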
text_content = div_tag.find_all(text=True, recursive=False)
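Since find_all returns a list rather than a single string, you will probably want to join and strip the result before printing. Here is a minimal sketch of the whole thing, assuming Beautiful Soup 4.4.0 or later (where string=True is the newer spelling of text=True) and the built-in html.parser. The helper name direct_text is made up for illustration; it is not part of the library.

```python
from bs4 import BeautifulSoup


def direct_text(tag):
    """Return only the text that sits directly inside `tag`.

    Hypothetical helper for illustration: recursive=False restricts the
    search to direct children, so text inside nested tags is skipped.
    """
    pieces = tag.find_all(string=True, recursive=False)
    # Join the direct text nodes and trim surrounding whitespace.
    return "".join(pieces).strip()


my_html = '<div > Hello World It is good to see you.<span>hi<img/></span></div>'
soup = BeautifulSoup(my_html, "html.parser")
div_tag = soup.find("div")

print(direct_text(div_tag))
```

If the div had text both before and after the span, both pieces would be concatenated in document order, so you may want to join with a space instead of an empty string in that case.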
Hope it helps.