How to parse XML from string in python

Time:10-06

I'm trying to parse XML from a string in Python, without success. The string I'm trying to parse is:

<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:573a453c-72c0-4185-8c54-9010593dd102">
   <data>
      <config xmlns="http://www.calix.com/ns/exa/base">
         <profile>
            <policy-map>
               <name>ELINE_PM_1</name>
               <class-map-ethernet>
                  <name>Eth-match-any-1</name>
                  <ingress>
                     <meter-type>meter-mef</meter-type>
                     <eir>1000000</eir>
                  </ingress>
               </class-map-ethernet>
            </policy-map>
            <policy-map>
               <name>ELINE_PM_2</name>
               <class-map-ethernet>
                  <name>Eth-match-any-2</name>
                  <ingress>
                     <meter-type>meter-mef</meter-type>
                     <eir>10000000</eir>
                  </ingress>
               </class-map-ethernet>
            </policy-map>
         </profile>
      </config>
   </data>
</rpc-reply>

I'm using the xml.etree.ElementTree library to parse the XML. I also tried removing the first line (the XML declaration with the version and encoding), but that made no difference. This snippet reproduces the issue:

import xml.etree.ElementTree as ET

reply_xml='''
<data>
   <config>
      <profile>
         <policy-map>
            <name>ELINE_PM_1</name>
            <class-map-ethernet>
               <name>Eth-match-any-1</name>
               <ingress>
                  <meter-type>meter-mef</meter-type>
                  <eir>1000000</eir>
               </ingress>
            </class-map-ethernet>
         </policy-map>
         <policy-map>
            <name>ELINE_PM_2</name>
            <class-map-ethernet>
               <name>Eth-match-any-2</name>
               <ingress>
                  <meter-type>meter-mef</meter-type>
                  <eir>10000000</eir>
               </ingress>
            </class-map-ethernet>
         </policy-map>
      </profile>
   </config>
</data>
'''

root = ET.fromstring(reply_xml)
for child in root:
    print(child.tag, child.attrib)

reply_xml is a string containing the XML shown above, so it should work, but when I inspect the root variable in the debugger it does not appear to be populated correctly. The XML declaration (<?xml version="1.0" encoding="UTF-8"?>) seems to cause problems, but even after removing it manually I cannot parse the XML correctly.

Any idea how to parse this XML?

CodePudding user response:

Your original XML has namespaces. You need to honor them in your XPath queries.

import xml.etree.ElementTree as ET

reply_xml = '''<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:573a453c-72c0-4185-8c54-9010593dd102">
   <data>
      <config xmlns="http://www.calix.com/ns/exa/base">
        <!-- ... the rest of it ... -->
      </config>
   </data>
</rpc-reply>'''

ns = {
    'calix': 'http://www.calix.com/ns/exa/base'
}

root = ET.fromstring(reply_xml)
for eir in root.findall('.//calix:eir', ns):
    print(eir.text)

prints

1000000
10000000
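
If you would rather not maintain a prefix map, the same lookup can be sketched in two other ways: with the namespace URI written inline in Clark notation ({uri}tag), or with the {*} wildcard, which needs Python 3.8 or later. The reply_xml below is a trimmed copy of the question's document, and the names CALIX, clark, wildcard and plain are only illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's reply (only the elements needed here).
reply_xml = '''<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:573a453c-72c0-4185-8c54-9010593dd102">
   <data>
      <config xmlns="http://www.calix.com/ns/exa/base">
         <profile>
            <policy-map>
               <name>ELINE_PM_1</name>
               <class-map-ethernet>
                  <ingress><eir>1000000</eir></ingress>
               </class-map-ethernet>
            </policy-map>
            <policy-map>
               <name>ELINE_PM_2</name>
               <class-map-ethernet>
                  <ingress><eir>10000000</eir></ingress>
               </class-map-ethernet>
            </policy-map>
         </profile>
      </config>
   </data>
</rpc-reply>'''

root = ET.fromstring(reply_xml)

# Illustrative constant holding the default namespace declared on <config>.
CALIX = 'http://www.calix.com/ns/exa/base'

# Clark notation: the namespace URI in braces, directly in front of the tag.
clark = [e.text for e in root.findall('.//{' + CALIX + '}eir')]

# Wildcard: match <eir> in any namespace (Python 3.8+).
wildcard = [e.text for e in root.findall('.//{*}eir')]

# An unqualified name matches nothing, because every element here is namespaced.
plain = root.findall('.//eir')

print(clark)     # ['1000000', '10000000']
print(wildcard)  # ['1000000', '10000000']
print(plain)     # []
```

The wildcard is convenient for quick scripts, while the explicit URI is safer when the same tag name may appear in more than one namespace.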

CodePudding user response:

Your code works fine. It prints every direct child of the root element <data>. The only child is <config>, which has no attributes, so the loop prints config {}.

To get to the <eir> tag, you should use XPath, or go through the tree recursively.

Quick solution for XPath:

root.findall('.//eir')
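
For the recursive route, Element.iter() walks the whole subtree for you. The sketch below uses a trimmed, namespace-free copy of the snippet from the question; eir_values and children are illustrative names. Note that on the original namespaced reply the tag would have to be written as {namespace-uri}eir for it to match.

```python
import xml.etree.ElementTree as ET

# Trimmed, namespace-free copy of the snippet from the question.
reply_xml = '''
<data>
   <config>
      <profile>
         <policy-map>
            <name>ELINE_PM_1</name>
            <class-map-ethernet>
               <ingress><eir>1000000</eir></ingress>
            </class-map-ethernet>
         </policy-map>
         <policy-map>
            <name>ELINE_PM_2</name>
            <class-map-ethernet>
               <ingress><eir>10000000</eir></ingress>
            </class-map-ethernet>
         </policy-map>
      </profile>
   </config>
</data>
'''

root = ET.fromstring(reply_xml)

# Iterating over an element yields only its direct children.
children = [(child.tag, child.attrib) for child in root]
print(children)    # [('config', {})]

# iter() visits every descendant in document order, filtered by tag name.
eir_values = [e.text for e in root.iter('eir')]
print(eir_values)  # ['1000000', '10000000']
```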

CodePudding user response:

See below (a one-liner using XPath):

import xml.etree.ElementTree as ET

reply_xml='''
<data>
   <config>
      <profile>
         <policy-map>
            <name>ELINE_PM_1</name>
            <class-map-ethernet>
               <name>Eth-match-any-1</name>
               <ingress>
                  <meter-type>meter-mef</meter-type>
                  <eir>1000000</eir>
               </ingress>
            </class-map-ethernet>
         </policy-map>
         <policy-map>
            <name>ELINE_PM_2</name>
            <class-map-ethernet>
               <name>Eth-match-any-2</name>
               <ingress>
                  <meter-type>meter-mef</meter-type>
                  <eir>20000000</eir>
               </ingress>
            </class-map-ethernet>
         </policy-map>
      </profile>
   </config>
</data>
'''

root = ET.fromstring(reply_xml)
eirs = [e.text for e in root.findall('.//eir')]
print(eirs)

output

['1000000', '20000000']
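
If you also need to know which policy each value belongs to, one possible extension is to loop over the <policy-map> elements and read the name and the eir from each. This is only a sketch on a trimmed copy of the sample above; eir_by_policy is an illustrative name, and it assumes every policy-map has exactly one <name> child and at least one <eir> descendant.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample used in this answer.
reply_xml = '''
<data>
   <config>
      <profile>
         <policy-map>
            <name>ELINE_PM_1</name>
            <class-map-ethernet>
               <name>Eth-match-any-1</name>
               <ingress><eir>1000000</eir></ingress>
            </class-map-ethernet>
         </policy-map>
         <policy-map>
            <name>ELINE_PM_2</name>
            <class-map-ethernet>
               <name>Eth-match-any-2</name>
               <ingress><eir>20000000</eir></ingress>
            </class-map-ethernet>
         </policy-map>
      </profile>
   </config>
</data>
'''

root = ET.fromstring(reply_xml)

eir_by_policy = {}
for policy in root.findall('.//policy-map'):
    # 'name' matches only the direct child, not the class-map's <name>.
    name = policy.findtext('name')
    # './/eir' returns the text of the first <eir> anywhere below this policy-map.
    eir = policy.findtext('.//eir')
    # Assumes the value is present and numeric, as in the sample.
    eir_by_policy[name] = int(eir)

print(eir_by_policy)  # {'ELINE_PM_1': 1000000, 'ELINE_PM_2': 20000000}
```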