I am new to using BeautifulSoup.
I have a line in an HTML file that is stored locally.
<LINK rel="stylesheet" type="text/css" href="report.css" >
I wish to remove that line, but I don't know what approach to use to find the line and remove it.
I can find the line using: old_text = soup.find("link", {"href": "report.css"})
But I can't work out how to remove it and save the file again.
CodePudding user response:
You could use .decompose() to get rid of the tag:
soup.find("link", {"href": "report.css"}).decompose()
or, with a CSS selector that matches any href starting with "report.":
soup.select_one('link[href^="report."]').decompose()
Then convert the BeautifulSoup object back to a string, which you can write to the file:
str(soup)
Example
from bs4 import BeautifulSoup
html = '''
<some tag>some content</some tag>
<LINK rel="stylesheet" type="text/css" href="report.css" >
<some tag>some content</some tag>
'''
soup = BeautifulSoup(html, "html.parser")
soup.select_one('link[href^="report."]').decompose()
print(str(soup))
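To cover the "save the file again" part of the question, here is a minimal sketch of the full round trip: read the local file, remove the tag, and write the result back. The function name remove_stylesheet_link and the file name report.html are made up for illustration, and UTF-8 encoding is an assumption. Note that find() returns None when nothing matches, so the sketch checks for that before calling .decompose() rather than chaining the calls.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup


def remove_stylesheet_link(path, href="report.css"):
    """Remove the first <link> whose href equals `href`.

    Hypothetical helper for illustration. Returns True if a tag was
    removed and the file rewritten, False if no matching tag was found.
    """
    path = Path(path)
    # Assumes the file is UTF-8 encoded; adjust if yours is not.
    soup = BeautifulSoup(path.read_text(encoding="utf-8"), "html.parser")

    tag = soup.find("link", {"href": href})
    if tag is None:
        # Nothing to remove, so leave the file untouched.
        return False

    tag.decompose()
    path.write_text(str(soup), encoding="utf-8")
    return True


# Demo on a temporary copy so no real file is modified.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "report.html"  # hypothetical file name
    demo.write_text(
        "<html><head>\n"
        '<LINK rel="stylesheet" type="text/css" href="report.css" >\n'
        "</head><body><p>some content</p></body></html>",
        encoding="utf-8",
    )
    removed = remove_stylesheet_link(demo)
    removed_again = remove_stylesheet_link(demo)
    result = demo.read_text(encoding="utf-8")

print(removed, removed_again)
print(result)
```

Because the file is re-serialized from the parsed tree, the saved markup may differ slightly in formatting from the original (for example, html.parser lowercases tag names), so it is worth keeping a backup of the original file.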