I have my html content as:
html = <div>new notes</div><div><ol><li>kssd</li></ol><ul><li>cds</li><li>dsdsk</li></ul><font color=\"#66717b\">ndsmnd</font></div>
When I convert the above expression to a string, it throws an error:
html_str = str(html)
I can see the " are already escaped here. Do I need to replace \" with \\" and then convert to a string?
CodePudding user response:
I think you need to use get_text()
from bs4 import BeautifulSoup
# html must already be a str; naming the parser avoids bs4's GuessedAtParserWarning
htmlvar = BeautifulSoup(html, 'html.parser')
print(htmlvar.get_text())
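If bs4 is not installed, a rough standard-library approximation of the same idea might look like the sketch below. It assumes the markup is already held in a quoted Python string, and TextCollector is a hypothetical helper name, not part of any library. It only concatenates text nodes, so it is not a full replacement for get_text().

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collects text nodes and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called once for each run of text between tags.
        self.parts.append(data)


# Assumption: the markup is stored as a normal (quoted) Python string.
html = (
    '<div>new notes</div><div><ol><li>kssd</li></ol>'
    '<ul><li>cds</li><li>dsdsk</li></ul>'
    '<font color="#66717b">ndsmnd</font></div>'
)

collector = TextCollector()
collector.feed(html)
collector.close()

text = "".join(collector.parts)
print(text)
```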
CodePudding user response:
You can try this:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
print(soup.prettify())
# html.parser adds no <html> wrapper to a fragment, so soup.html would be None here
string = str(soup)
print(string)
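Note that all of the above assumes html is already a Python string. In the question the markup is written without quotes, which is probably what raises the error in the first place. A minimal sketch of that fix, assuming the value is meant to be a string literal:

```python
# Assumption: the markup is meant to be a Python string literal.
# Single quotes on the outside let the double quotes in the color
# attribute stay as they are, with no backslash escaping needed.
html = (
    '<div>new notes</div><div><ol><li>kssd</li></ol>'
    '<ul><li>cds</li><li>dsdsk</li></ul>'
    '<font color="#66717b">ndsmnd</font></div>'
)

# str() on a value that is already a str returns an equal string.
html_str = str(html)
print(html_str)
```

If the content actually arrives from JSON or another serialized source, the \" is part of that format's escaping and disappears once the value is decoded, so no manual replacing should be needed.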