I am parsing an XML file and trying to delete an empty node, but I am receiving the following error:
ValueError: list.remove(x): x not in list
The XML file is as follows:
<toc>
<topic filename="GUID-5B8DE7B7-879F-45A4-88E0-732155904029.xml" docid="GUID-5B8DE7B7-879F-45A4-88E0-732155904029" TopicTitle="Notes, cautions, and warnings" />
<topic filename="GUID-89943A8D-00D3-4263-9306-CDC944609F2B.xml" docid="GUID-89943A8D-00D3-4263-9306-CDC944609F2B" TopicTitle="HCI Deployment with Windows Server">
<childTopics>
<topic filename="GUID-A3E5EA96-2110-46FF-9251-2291DF755F50.xml" docid="GUID-A3E5EA96-2110-46FF-9251-2291DF755F50" TopicTitle="Installing the OMIMSWAC license" />
<topic filename="GUID-7C4D616D-0D9A-4AE1-BE0F-EC6FC9DAC87E.xml" docid="GUID-7C4D616D-0D9A-4AE1-BE0F-EC6FC9DAC87E" TopicTitle="Managing Microsoft HCI-based clusters">
<childTopics>
</childTopics>
</topic>
</childTopics>
</topic>
</toc>
Kindly note that this is just an example of the format of my XML file. In this file, I want to remove the empty childTopics tag, but I am getting an error. My current code is:
import xml.etree.ElementTree as ET
tree = ET.parse("toc2 - Copy.xml")
root = tree.getroot()
node_to_remove = root.findall('.//childTopics//childTopics')
for node in node_to_remove:
    root.remove(node)
CodePudding user response:
You need to call remove on the node's immediate parent, not on root. This is tricky using xml.etree, but if you use lxml.etree instead, you can write:
import lxml.etree as ET
tree = ET.parse("data.xml")
root = tree.getroot()
node_to_remove = root.findall('.//childTopics//childTopics')
for node in node_to_remove:
    node.getparent().remove(node)
print(ET.tostring(tree).decode())
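To see why the original call fails, here is a minimal sketch using only the standard library. The one-line XML string is a made-up stand-in for the real file. It shows that remove() only searches an element's direct children, so passing a deeper descendant raises ValueError, while calling it on the immediate parent succeeds:

```python
import xml.etree.ElementTree as ET

# Tiny made-up document: childTopics is a grandchild of the root.
root = ET.fromstring("<toc><topic><childTopics/></topic></toc>")
nested = root.find(".//childTopics")

# remove() checks direct children only, so a grandchild is not found.
try:
    root.remove(nested)
    error = None
except ValueError as exc:
    error = exc
print("remove on root failed:", error)

# The same call on the immediate parent works.
topic = root.find("topic")
topic.remove(nested)
print(ET.tostring(root, encoding="unicode"))
```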
Nodes in xml.etree do not have a getparent() method. If you're unable to use lxml, you'll need to look into other solutions for finding the parent of a node; this question has some discussion on that topic.
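One common standard-library workaround is to build a child-to-parent map once and look up each node's parent in it. The following is a minimal sketch, not the only way to do it. The XML is a shortened stand-in for the file in the question, the TopicTitle values are placeholders, and parent_of is just an illustrative name:

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the question's file; attribute values are placeholders.
XML = """<toc>
<topic TopicTitle="A" />
<topic TopicTitle="B">
<childTopics>
<topic TopicTitle="C" />
<topic TopicTitle="D">
<childTopics>
</childTopics>
</topic>
</childTopics>
</topic>
</toc>"""

root = ET.fromstring(XML)

# xml.etree elements keep no parent pointer, so record one for every child.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Same XPath as in the question: childTopics nested inside another childTopics.
for node in root.findall('.//childTopics//childTopics'):
    parent_of[node].remove(node)

result = ET.tostring(root, encoding="unicode")
print(result)
```

Like the code in the question, this removes every element the XPath matches. To restrict it to elements with no children, add a check such as `if len(node) == 0:` inside the loop.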