I have some XML where each element has slightly different attributes. I only want to pull the element with the population attribute. The below code works. However, if I uncomment the population assignment and print at the bottom, it fails because the population attribute is not in the first two elements. How do I only select the element with that specific attribute? It's the only one I need anyway.
from xml.etree import ElementTree as ET
xml = '''<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<gdppc>141100</gdppc>
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore">
<rank>4</rank>
<year>2011</year>
<gdppc>59900</gdppc>
<neighbor name="Malaysia" direction="N"/>
</country>
<country name="Panama">
<rank>68</rank>
<year>2011</year>
<gdppc>13600</gdppc>
<neighbor name="Costa Rica" direction="W"/>
<neighbor name="Colombia" direction="E" population="500"/>
</country>
</data>'''
root = ET.fromstring(xml)
print(root.tag)
print(root.findall('./country'))  # root is already <data>, so search relative to it
for target in root.findall('.//country'):
    name = target.attrib['name']
    #population = target.attrib['population']
    print(name)
    #print(population)
CodePudding user response:
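Your loop iterates over the country elements and reads a population attribute from each one, but no country element has that attribute; it is set on a neighbor element, so target.attrib['population'] raises a KeyError.
To get any element with a population attribute, regardless of its tag, use a wildcard for the element name: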
for target in root.findall('.//*[@population]'):
    print(target.tag, target.attrib)
Output:
neighbor {'name': 'Colombia', 'direction': 'E', 'population': '500'}
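If you also need to know which country the matching element belongs to, one option is to loop over the countries and apply the attribute predicate to their neighbor children. The following is only a sketch: it uses a trimmed copy of the XML from the question, and the results list is a name introduced here for illustration.

```python
from xml.etree import ElementTree as ET

# Trimmed copy of the question's XML (assumption: same structure as the original)
xml = '''<data>
  <country name="Singapore">
    <neighbor name="Malaysia" direction="N"/>
  </country>
  <country name="Panama">
    <neighbor name="Costa Rica" direction="W"/>
    <neighbor name="Colombia" direction="E" population="500"/>
  </country>
</data>'''

root = ET.fromstring(xml)

results = []  # hypothetical name: (country, neighbor, population) tuples
for country in root.findall('./country'):
    # Only neighbor children that carry a population attribute are matched
    for neighbor in country.findall('./neighbor[@population]'):
        results.append(
            (country.get('name'), neighbor.get('name'), neighbor.get('population'))
        )

print(results)
```

Countries without a matching neighbor are skipped, so no KeyError can occur. Attribute values come back as strings, so convert with int() if you need a number.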