How to remove a child of a child node using Python ElementTree in an XML file

Time:12-15

I am a beginner with XML. I am currently using Python's ElementTree to process it. My XML file looks like this:

<net>
    <edge id=":1006232713_w0" function="walkingarea">
        <lane id=":1006232713_w0_0" index="0" allow="pedestrian" speed="1.00" />
        <lane id=":1006232713_w0_1" index="0" disallow="pedestrian" speed="1.00"/>      
    </edge>
    <edge id=":1006237429_0" function="internal">
        <lane id=":1006237429_0_0" index="0" allow="delivery bicycle" speed="5.69"/>
    </edge>
    <edge id=":1006237429_1" function="internal">
        <lane id=":1006237429_1_0" index="0" allow="pedestrian" speed="3.65"/>
    </edge>
    <edge id=":1006237429_w0" function="walkingarea">
        <lane id=":1006237429_w0_0" index="0" allow="pedestrian" speed="1.00"/>
        <lane id=":1006237429_w0_0" index="0" disallow="pedestrian" speed="5.50"/>
    </edge>
    <edge id=":1006249156_w0" function="walkingarea">
        <lane id=":1006249156_w0_0" index="0" allow="pedestrian" speed="1.00"/>
    </edge>
    <edge id=":1006249161_w0" function="walkingarea">
        <lane id=":1006249161_w0_0" index="0" disallow="pedestrian" speed="1.00"/>
    </edge>
        
</net>

In this XML, the child elements of "net" are "edge" elements, and each edge has "lane" children. Requirement: within each edge, I want to keep the lane that has the attribute allow="pedestrian" and delete the other lanes. If no lane under an edge has allow="pedestrian", I want to delete that edge together with its lanes.

Desired output

<net>

    <edge id=":1006232713_w0" function="walkingarea">
        <lane id=":1006232713_w0_0" index="0" allow="pedestrian" speed="1.00" />        
    </edge>

    <edge id=":1006237429_w0" function="walkingarea">
        <lane id=":1006237429_w0_0" index="0" allow="pedestrian" speed="1.00"/>
    </edge>
    <edge id=":1006249156_w0" function="walkingarea">
        <lane id=":1006249156_w0_0" index="0" allow="pedestrian" speed="1.00"/>
    </edge>
    
</net>

I tried to find the id of the lane that has the attribute allow="pedestrian" using the code below:

for edge in root.findall("./edge/lane/[@allow= 'pedestrian']..."):
    for lane in edge.find("./lane/[@allow= 'pedestrian']..."):
        print(lane.attrib['id'])

This finds the right edges, but it prints the ids of both lanes under each edge. I want to pick only the lane that has allow="pedestrian" under the edge and delete the other lanes. If no lane under an edge has allow="pedestrian", I want to delete that edge together with its lanes. It would be really helpful if anyone could address this.

CodePudding user response:

I would use lxml instead of ElementTree because of its better XPath support.
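
As for why the loop in the question prints both lane ids: the path there appears to step back up to the parent edge, and iterating over an Element yields all of its children, no matter which path located it. A minimal sketch with ElementTree, using a shortened made-up sample (the ids e1, e1_0 and e1_1 are placeholders, not from your file):

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample with one edge and two lanes.
xml_text = """<net>
    <edge id="e1" function="walkingarea">
        <lane id="e1_0" allow="pedestrian"/>
        <lane id="e1_1" disallow="pedestrian"/>
    </edge>
</net>"""

root = ET.fromstring(xml_text)
edge = root.find("edge")

# Iterating over an element walks every direct child.
all_lanes = [lane.get("id") for lane in edge]

# Filtering has to happen in the path applied to the edge itself.
pedestrian_lanes = [
    lane.get("id") for lane in edge.findall("lane[@allow='pedestrian']")
]

print(all_lanes)
print(pedestrian_lanes)
```

So the filter has to be applied when selecting lanes from the edge, not only when selecting the edge.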

Note: the answer assumes the actual file structure is exactly the same as in your question. If not, the answer may not work as is.

So try something like this (though in your case, you'll have to parse a file, not load from a string like below):

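from lxml import etree

# Placeholder: paste the XML from the question here
nets = """[your xml above]"""
doc = etree.fromstring(nets)

for edge in doc.xpath('//edge'):
    # Lanes of this edge that allow pedestrians
    target = edge.xpath('./lane[@allow="pedestrian"]')
    if target:
        # Remove every other lane of this edge, wherever it sits
        for lane in edge.xpath('./lane[not(@allow="pedestrian")]'):
            edge.remove(lane)
    else:
        # No pedestrian lane at all: remove the whole edge
        edge.getparent().remove(edge)

print(etree.tostring(doc).decode())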

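The output should match your expected output, with one exception: the edge ":1006237429_1" is kept, because its lane has allow="pedestrian", even though your desired output omits it.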
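
If you would rather stay with the standard library, the same idea can be sketched with ElementTree alone. This is only a sketch under the same assumption about the file structure (edges directly under the root, lanes directly under each edge); the function name prune_pedestrian_lanes and the shortened sample ids are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample in the same shape as the question's file.
SAMPLE = """<net>
    <edge id="a" function="walkingarea">
        <lane id="a_0" allow="pedestrian"/>
        <lane id="a_1" disallow="pedestrian"/>
    </edge>
    <edge id="b" function="internal">
        <lane id="b_0" allow="delivery bicycle"/>
    </edge>
    <edge id="c" function="walkingarea">
        <lane id="c_0" disallow="pedestrian"/>
    </edge>
</net>"""


def prune_pedestrian_lanes(root):
    """Keep only lanes with allow="pedestrian"; drop edges left without a lane."""
    # findall() returns a list, so removing elements inside the loops is safe.
    for edge in root.findall("edge"):
        for lane in edge.findall("lane"):
            if lane.get("allow") != "pedestrian":
                edge.remove(lane)
        # An edge with no remaining lane (or none to begin with) is removed.
        if edge.find("lane") is None:
            root.remove(edge)
    return root


root = prune_pedestrian_lanes(ET.fromstring(SAMPLE))
remaining = [
    (edge.get("id"), [lane.get("id") for lane in edge.findall("lane")])
    for edge in root.findall("edge")
]
print(remaining)  # [('a', ['a_0'])]
```

For a real file you would load it with ET.parse(...), pass tree.getroot() to the function, and save the result with tree.write(...); the file names are up to you.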
