I'm attempting to find an XML element called "md:EntityDescriptor" using the following Python code:
from lxml import etree as ET  # lxml assumed, since root.nsmap is used below

def parse(filepath):
    xmlfile = str(filepath)
    doc1 = ET.parse(xmlfile)
    root = doc1.getroot()
    test = root.find('md:EntityDescriptor', namespaces)
    print(test)
This is the beginning of my XML document, which is a SAML assertion. I've omitted the rest for readability and security, but the element I'm searching for is literally at the very beginning:
<?xml version="1.0" encoding="UTF-8"?>
<md:EntityDescriptor ...
I have a namespace defining "md" and several others:
namespaces = {'md': 'urn:oasis:names:tc:SAML:2.0:metadata'}
yet the output of print(test) is None.
Running ET.dump(root) outputs the full contents of the file, so I know it isn't a problem with the input I'm passing. Running print(root.nsmap) returns:
{'md': 'urn:oasis:names:tc:SAML:2.0:metadata'}
CodePudding user response:
If md:EntityDescriptor is the root element, trying to find a child md:EntityDescriptor element with find isn't going to work. With a plain tag name, find only looks at the direct children of the element you call it on, and you've already selected that element as root.
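To see the behaviour in isolation, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree instead of lxml so it runs without extra packages; find treats a plain tag name the same way in both. The inline XML is a made-up stand-in for the real metadata file, and the md:IDPSSODescriptor child is included only for illustration.

```python
import xml.etree.ElementTree as ET

namespaces = {'md': 'urn:oasis:names:tc:SAML:2.0:metadata'}

# Hypothetical stand-in for the real file: EntityDescriptor is the root.
xml_text = (
    '<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">'
    '<md:IDPSSODescriptor/>'
    '</md:EntityDescriptor>'
)
root = ET.fromstring(xml_text)

# The root already is the element being searched for.
root_tag = root.tag

# find() with a plain tag name only inspects direct children of root,
# so it cannot match root itself.
self_lookup = root.find('md:EntityDescriptor', namespaces)

# A direct child, on the other hand, is found.
child_lookup = root.find('md:IDPSSODescriptor', namespaces)

print(root_tag)
print(self_lookup)
print(child_lookup is not None)
```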
However, the problem is that I need to run this same operation on multiple files, and md:EntityDescriptor is not always the root element. Is there a way to find an element regardless of whether or not it's the root?
Since you're using lxml, try using xpath() and the descendant-or-self:: axis instead of find:
test = root.xpath('descendant-or-self::md:EntityDescriptor', namespaces=namespaces)
Note that xpath() returns a list, so check that it isn't empty and take the first item if you need a single element.
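If you would rather get back a single element or None, one alternative (not the xpath() approach above) is iter(), which visits the element itself first and then all of its descendants. The sketch below uses the standard library's xml.etree.ElementTree so it runs as-is; lxml elements accept the same iter() call. The helper name first_self_or_descendant and the inline sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

MD_NS = 'urn:oasis:names:tc:SAML:2.0:metadata'


def first_self_or_descendant(root, namespace, local_name):
    """Return the first matching element, starting with root itself, or None."""
    # iter() expects the tag in {namespace}localname form, not a prefix.
    qualified = f'{{{namespace}}}{local_name}'
    return next(root.iter(qualified), None)


# Hypothetical sample documents covering the three cases.
as_root = ET.fromstring(f'<md:EntityDescriptor xmlns:md="{MD_NS}"/>')
nested = ET.fromstring(
    f'<md:EntitiesDescriptor xmlns:md="{MD_NS}">'
    '<md:EntityDescriptor/>'
    '</md:EntitiesDescriptor>'
)
missing = ET.fromstring(f'<md:EntitiesDescriptor xmlns:md="{MD_NS}"/>')

hit_root = first_self_or_descendant(as_root, MD_NS, 'EntityDescriptor')
hit_nested = first_self_or_descendant(nested, MD_NS, 'EntityDescriptor')
hit_missing = first_self_or_descendant(missing, MD_NS, 'EntityDescriptor')

print(hit_root is as_root)  # the root itself is returned
print(hit_nested.tag)       # the nested element is found
print(hit_missing)          # no match
```

Because iter() takes the fully qualified tag rather than a prefix, no namespaces dictionary is needed for this variant.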