Get children elements of multiple instances of the same name tag using ElementTree

Time:07-16

I have an xml file looking like this:

<?xml version="1.0" encoding="UTF-8"?>
<data>
<boundary_conditions>
  <rot>
    <rot_instance>
      <name>BC_1</name>
      <rpm>200</rpm>
      <parts>
        <name>rim_FL</name>
        <name>tire_FL</name>
        <name>disk_FL</name>
        <name>center_FL</name>
      </parts>
    </rot_instance>
    <rot_instance>
      <name>BC_2</name>
      <rpm>100</rpm>
      <parts>
        <name>tire_FR</name>
        <name>disk_FR</name>
      </parts>
    </rot_instance>
  </rot>
</boundary_conditions>
</data>

I already know how to extract a value that appears once per instance. For example, I can get the name tag of each instance as follows:

import xml.etree.ElementTree as ET
tree = ET.parse('file.xml')
root = tree.getroot()
names = tree.findall('.//boundary_conditions/rot/rot_instance/name')
for val in names:
    print(val.text)

which gives me:

BC_1
BC_2

But if I do the same thing for the parts tag:

names = tree.findall('.//boundary_conditions/rot/rot_instance/parts/name')
for val in names:
    print(val.text)

It will give me:

rim_FL
tire_FL
disk_FL
center_FL
tire_FR
disk_FR

This combines the parts/name values of all instances into one flat sequence. I want the names under the 'parts' sub-element as a separate list for each instance. So this is what I want to get:

instance_BC_1 = ['rim_FL', 'tire_FL', 'disk_FL', 'center_FL']
instance_BC_2 = ['tire_FR', 'disk_FR']

Any help is appreciated. Thanks.

CodePudding user response:

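You first have to find all the parts elements, and then find the name tags inside each parts element separately. Calling findall on an element searches only below that element, so the names of different instances stay apart.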

Take a look:

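# One <parts> element per rot_instance, in document order
parts = tree.findall('.//boundary_conditions/rot/rot_instance/parts')
for part in parts:
    # Search only the direct <name> children of this <parts> element
    for val in part.findall("name"):
        print(val.text)

    print()  # blank line between instances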


instance_BC_1 = [val.text for val in parts[0].findall("name")]
instance_BC_2 = [val.text for val in parts[1].findall("name")]

print(instance_BC_1)
print(instance_BC_2)

Output:

rim_FL
tire_FL
disk_FL
center_FL

tire_FR
disk_FR

['rim_FL', 'tire_FL', 'disk_FL', 'center_FL']
['tire_FR', 'disk_FR']
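
If the number of instances is not known in advance, indexing parts[0] and parts[1] by hand does not scale. A minimal sketch of one alternative, assuming every rot_instance has exactly one direct name child and one parts child as in the question: loop over the rot_instance elements and build a dict keyed by the instance name. The helper name parts_by_instance and the inline XML string are only for illustration; with a real file you would pass ET.parse('file.xml').getroot() instead.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML (rpm omitted), with <rot> and
# <boundary_conditions> closed so that it parses.
XML = """\
<data>
<boundary_conditions>
  <rot>
    <rot_instance>
      <name>BC_1</name>
      <parts>
        <name>rim_FL</name>
        <name>tire_FL</name>
        <name>disk_FL</name>
        <name>center_FL</name>
      </parts>
    </rot_instance>
    <rot_instance>
      <name>BC_2</name>
      <parts>
        <name>tire_FR</name>
        <name>disk_FR</name>
      </parts>
    </rot_instance>
  </rot>
</boundary_conditions>
</data>
"""


def parts_by_instance(root):
    """Map each rot_instance's own name to the list of its parts/name texts."""
    result = {}
    for inst in root.findall('.//rot_instance'):
        # findtext('name') matches only the direct <name> child (BC_1, BC_2),
        # not the <name> tags nested under <parts>.
        key = inst.findtext('name')
        result[key] = [n.text for n in inst.findall('parts/name')]
    return result


root = ET.fromstring(XML)
instances = parts_by_instance(root)
for name, part_names in instances.items():
    print(name, part_names)
```

The dict lets you look up instances['BC_1'] directly instead of creating one variable per instance.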