Python LXML fails to find XML element

Time:10-28

I'm attempting to find an XML element called "md:EntityDescriptor" using the following Python code:

from lxml import etree as ET

def parse(filepath):
    xmlfile = str(filepath)
    doc1 = ET.parse(xmlfile)
    root = doc1.getroot()
    # namespaces is the prefix-to-URI dict shown further down
    test = root.find('md:EntityDescriptor', namespaces)
    print(test)

This is the beginning of my XML document, which is a SAML assertion. I've omitted the rest for readability and security, but the element I'm searching for is literally at the very beginning:

<?xml version="1.0" encoding="UTF-8"?>
<md:EntityDescriptor ...

I have a namespace defining "md" and several others:

namespaces = {'md': 'urn:oasis:names:tc:SAML:2.0:metadata'}

yet the output of print(test) is None.

Running ET.dump(root) outputs the full contents of the file, so I know it isn't a problem with the input I'm passing. Running print(root.nsmap) returns:

{'md': 'urn:oasis:names:tc:SAML:2.0:metadata'}

CodePudding user response:

If md:EntityDescriptor is the root element, find() will not match it: find() searches only the children of the element it is called on, and you've already selected that element as root.
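
The behaviour can be reproduced with a small inline document. This is only a sketch: the XML string is a made-up, heavily trimmed stand-in for a real metadata file, and it uses the standard library's xml.etree.ElementTree, whose find() follows the same children-only rule as lxml's.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {'md': 'urn:oasis:names:tc:SAML:2.0:metadata'}

# Hypothetical stand-in for a SAML metadata file (not the asker's data).
SAMPLE = (
    '<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">'
    '<md:IDPSSODescriptor/>'
    '</md:EntityDescriptor>'
)

root = ET.fromstring(SAMPLE)

# find() inspects only the children of root, so root itself never matches.
as_child = root.find('md:EntityDescriptor', NAMESPACES)

# A direct child, by contrast, is found.
child = root.find('md:IDPSSODescriptor', NAMESPACES)

# The root's own tag is stored in {namespace-uri}localname form.
root_tag = root.tag
```

Here as_child is None even though the document parsed correctly, which matches the symptom in the question.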

However, the problem is that I need to run this same operation on multiple files, and md:EntityDescriptor is not always the root element. Is there a way to find an element regardless of whether or not it's the root?

Since you're using lxml, try using xpath() and the descendant-or-self:: axis instead of find:

test = root.xpath('descendant-or-self::md:EntityDescriptor', namespaces=namespaces)

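Note that xpath() returns a list, so test will be a list of matching elements (empty if nothing matches) rather than a single element or None.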
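
If the rest of the code expects the find()-style result (one element or None), the list can be unwrapped in a small helper. The following is a minimal sketch, assuming lxml is installed; the helper name find_entity_descriptor and the three inline test documents are hypothetical and only serve to illustrate the root, nested, and no-match cases.

```python
from lxml import etree as ET

NAMESPACES = {'md': 'urn:oasis:names:tc:SAML:2.0:metadata'}


def find_entity_descriptor(root):
    """Return the first md:EntityDescriptor at or below root, or None."""
    matches = root.xpath('descendant-or-self::md:EntityDescriptor',
                         namespaces=NAMESPACES)
    # xpath() returns a list; mimic find() by returning one element or None.
    return matches[0] if matches else None


# Hypothetical inputs covering the three situations.
ROOT_CASE = (
    b'<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"/>'
)
NESTED_CASE = (
    b'<md:EntitiesDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">'
    b'<md:EntityDescriptor/>'
    b'</md:EntitiesDescriptor>'
)
NO_MATCH_CASE = b'<other/>'

as_root = find_entity_descriptor(ET.fromstring(ROOT_CASE))
nested = find_entity_descriptor(ET.fromstring(NESTED_CASE))
missing = find_entity_descriptor(ET.fromstring(NO_MATCH_CASE))
```

The same helper can then be called on every file, whether or not md:EntityDescriptor happens to be the root element.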
