I have the following XML file:
<customer>
  <id>807997287</id>
  <dateCreated>2022-11-13T00:00:00Z</dateCreated>
  <status>Created</status>
  <client>
    <id>807997223</id>
    <firstname>Jeff</firstname>
    <lastname>Smith</lastname>
    <address>
      <id>4388574</id>
      <home>
        <addressLine1>Address Line 1</addressLine1>
        <addressLine2>Address Line 2</addressLine2>
        <addressLine3>Address Line 3</addressLine3>
        <addressLine4>Address Line 4</addressLine4>
        <postCode>XXX ZZZ</postCode>
      </home>
      <telephoneNumbers>
        <telephone>
          <id>807997230</id>
          <areaCode>01123</areaCode>
          <phoneNumber>123123</phoneNumber>
          <usage>Work</usage>
        </telephone>
        <telephone>
          <id>807997232</id>
          <areaCode>01564</areaCode>
          <phoneNumber>123123</phoneNumber>
          <usage>Home</usage>
        </telephone>
      </telephoneNumbers>
    </address>
  </client>
</customer>
I need to remove all the id nodes.
I have tried the following code, but it A) doesn't find all the ids and B) doesn't remove them:
import xml.etree.ElementTree as ET
tree = ET.ElementTree()
tree.parse('test.xml')
root = tree.getroot()
ids = root.findall(".//id")
for item in ids:
    ids.remove(item)
    print(ET.tostring(item))
t = ET.ElementTree(root)
t.write("output.xml")
The command-line output is:
b'<id>807997287</id>\n '
b'<id>4388574</id>\n '
b'<id>807997232</id>\n '
And output.xml remains unchanged.
Can anyone help point me in the right direction with this one please?
Answer:
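ids.remove(item) removes the element from the Python list returned by findall(), not from the XML tree, and mutating that list while looping over it is why only every other id gets printed. An element has to be removed from its parent element instead. You are probably looking for something like:
for parent in root.findall('.//id/..'):
    # remove each direct <id> child from its parent element
    for id_elem in parent.findall('id'):
        parent.remove(id_elem)
print(ET.tostring(root).decode())
The output should be your expected output: the same document with every id element removed.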
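If you would rather not rely on the '..' parent step in ElementTree's XPath subset, the same idea can be written as a plain loop over every element and its children. This is only a minimal sketch: the function name remove_elements_by_tag and the shortened SAMPLE document are made up for illustration and are not part of the question or of ElementTree itself.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the XML in the question (assumption: same nesting pattern).
SAMPLE = """<customer>
  <id>807997287</id>
  <status>Created</status>
  <client>
    <id>807997223</id>
    <firstname>Jeff</firstname>
    <telephoneNumbers>
      <telephone>
        <id>807997230</id>
        <usage>Work</usage>
      </telephone>
      <telephone>
        <id>807997232</id>
        <usage>Home</usage>
      </telephone>
    </telephoneNumbers>
  </client>
</customer>"""


def remove_elements_by_tag(root, tag):
    """Hypothetical helper: detach every element named `tag` below `root`.

    Returns the number of elements that were detached.
    """
    removed = 0
    # Take a snapshot of all elements first, so the tree is not
    # mutated while root.iter() is still walking it.
    for parent in list(root.iter()):
        # Iterate over a copy of the children, because remove() mutates the parent.
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
                removed += 1
    return removed


root = ET.fromstring(SAMPLE)
removed_count = remove_elements_by_tag(root, "id")
print(removed_count)
print(ET.tostring(root, encoding="unicode"))
```

With a tree parsed from a file, as in the question, the same function can be called on tree.getroot() and the result saved with tree.write("output.xml").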