How to parse XML with namespaces and attributes in Python?

Time:09-03

Hi, I am trying to parse XML that uses namespaces and attributes.

I am close: root.findall() and .get() already return most of what I need.

However, I am still struggling to extract only the values I want from the XML file.

How do I get the XML attribute values?

Input:

<?xml version="1.0" encoding="UTF-8"?>
<message:GenericData
xmlns:message="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message"
xmlns:common="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:generic="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic"
xsi:schemaLocation="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message https://sdw-wsrest.ecb.europa.eu:443/vocabulary/sdmx/2_1/SDMXMessage.xsd
http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common https://sdw-wsrest.ecb.europa.eu:443/vocabulary/sdmx/2_1/SDMXCommon.xsd
http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic https://sdw-wsrest.ecb.europa.eu:443/vocabulary/sdmx/2_1/SDMXDataGeneric.xsd">

<generic:Obs>
<generic:ObsDimension value="1999-01"/>
<generic:ObsValue value="0.7029125"/>
</generic:Obs>

<generic:Obs>
<generic:ObsDimension value="1999-02"/>
<generic:ObsValue value="0.688505"/>
</generic:Obs>

Code:

import xml.etree.ElementTree as ET
tree = ET.parse("file.xml")
root = tree.getroot()
for x in root.findall('.//'):
    print(x.tag, " ", x.get('value'))

Output:

{http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic}Obs   None
{http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic}ObsDimension   1999-01
{http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic}ObsValue   0.7029125
{http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic}Obs   None
{http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic}ObsDimension   1999-02
{http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic}ObsValue   0.688505

Expected output:

1999-01  0.7029125

1999-02  0.688505

CodePudding user response:

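How about this? It assumes every Obs element is a direct child of the root, as in your sample, and prints the value attribute of each of its children on one line:

for parent in root:
    print('  '.join([child.get('value', "") for child in parent]))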
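If the Obs elements sit deeper in the tree, or you would rather select them by name, a namespace-aware lookup may be more robust. The following is only a sketch: the NS dictionary, the SAMPLE string and the read_observations helper are names made up for this illustration, and SAMPLE is a trimmed copy of the input from the question with a closing tag added so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map; the URI is copied from the xmlns:generic declaration
# in the question. The "generic" key is only a local alias for the lookups.
NS = {"generic": "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic"}

# Trimmed copy of the question's input (assumption: closing tag added here).
SAMPLE = """<message:GenericData
    xmlns:message="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message"
    xmlns:generic="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic">
  <generic:Obs>
    <generic:ObsDimension value="1999-01"/>
    <generic:ObsValue value="0.7029125"/>
  </generic:Obs>
  <generic:Obs>
    <generic:ObsDimension value="1999-02"/>
    <generic:ObsValue value="0.688505"/>
  </generic:Obs>
</message:GenericData>"""


def read_observations(root):
    """Return a list of (dimension, value) string pairs, one per Obs element."""
    rows = []
    # ".//" searches at any depth, so nesting under other elements is fine.
    for obs in root.findall(".//generic:Obs", NS):
        dim = obs.find("generic:ObsDimension", NS)
        val = obs.find("generic:ObsValue", NS)
        # Compare against None explicitly: childless elements are falsy.
        rows.append((
            dim.get("value", "") if dim is not None else "",
            val.get("value", "") if val is not None else "",
        ))
    return rows


root = ET.fromstring(SAMPLE)  # for a file: ET.parse("file.xml").getroot()
for period, value in read_observations(root):
    print(period, "", value)
```

On Python 3.8 or later, root.findall(".//{*}Obs") should match the same elements without a prefix map, at the cost of ignoring which namespace they belong to.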