python xml filter based on multiple conditions on different nodes

Time:08-04

I have the following xml data:

<?xml version="1.0"?>
<Company>
  <Employee1>
      <FirstName>Tanmay</FirstName>
      <LastName>Patil</LastName>
      <ContactNo>1234567890</ContactNo>
      <Email>[email protected]</Email>
      <Address>
           <City>Bangalore</City>
      </Address>
      <name> XXXXX</name>
  </Employee1>
    <Employee2>
      <FirstName>Tanmay</FirstName>
      <LastName>Patil</LastName>
      <ContactNo>1234567890</ContactNo>
      <Email>[email protected]</Email>
      <Address>
           <City>Chennai</City>
      </Address>
      <name> YYYYYY</name>
  </Employee2>
    <Employee3>
      <FirstName>Tanmay</FirstName>
      <LastName>Patil</LastName>
      <ContactNo>1234567890</ContactNo>
      <Email>[email protected]</Email>
      <Address>
           <City>Bangalore</City>
      </Address>
      <name> ZZZZZ</name>
  </Employee3>
</Company>

I want to filter on City = Bangalore and get the contents of the name tag for each matching employee.

The desired output when filtering on City = Bangalore:

        <name> XXXXX</name>
        <name> ZZZZZ</name>

I tried the following, but it did not work:

import xml.etree.ElementTree as ET
tree = ET.parse('file.xml')
tree.findall('city=Bangalore').name

This did not give me what I want. Can someone help, please?

CodePudding user response:

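Try XPath. The employee elements have different tag names (Employee1, Employee2, ...), so match any element with * rather than Employee:

'//*[Address/City="Bangalore"]/name'

This expression needs a full XPath 1.0 engine such as lxml; xml.etree.ElementTree supports only a limited subset of XPath.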
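
With the standard library's xml.etree.ElementTree, which the question uses, the same filter can be sketched as a plain loop instead of a single XPath expression. This is a minimal sketch, assuming the XML is available as a string: the variable XML_DATA (a trimmed copy of the question's data) and the helper name names_in_city are illustrative and not part of the original post. The employees are taken as the direct children of the root because their tag names differ.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's data (illustrative; only the fields used here).
XML_DATA = """<Company>
  <Employee1>
      <Address><City>Bangalore</City></Address>
      <name> XXXXX</name>
  </Employee1>
  <Employee2>
      <Address><City>Chennai</City></Address>
      <name> YYYYYY</name>
  </Employee2>
  <Employee3>
      <Address><City>Bangalore</City></Address>
      <name> ZZZZZ</name>
  </Employee3>
</Company>"""


def names_in_city(root, city):
    """Return the <name> elements of every employee whose Address/City equals city."""
    matches = []
    for employee in root:  # direct children of <Company>: Employee1, Employee2, ...
        # findtext returns None when Address/City is missing, hence the `or ""`.
        if (employee.findtext("Address/City") or "").strip() == city:
            name = employee.find("name")
            if name is not None:
                matches.append(name)
    return matches


root = ET.fromstring(XML_DATA)
for name in names_in_city(root, "Bangalore"):
    # tostring also serializes trailing whitespace (the element's tail), so strip it.
    print(ET.tostring(name, encoding="unicode").strip())
```

For a file on disk, ET.parse('file.xml').getroot() would replace ET.fromstring(XML_DATA). Note that tag names are case-sensitive, so the path must say City, not city.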

CodePudding user response:

I had a similar problem and solved it with minidom, using two small helper functions. The code below is based on your original XML; it collects every employee node whose City is Bangalore and prints the ContactNo and name of each.

from xml.dom import minidom

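def findChild(parent, childLocalName):
    # Return (True, node) for the first child element with the given tag name,
    # or (False, None) if there is no such child.
    for child in parent.childNodes:
        # Skip whitespace text nodes; only element nodes have a tagName.
        if child.nodeType == child.ELEMENT_NODE and child.tagName == childLocalName:
            return True, child
    return False, None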

def followXMLPath(parent, path):  # path is a list of tag names, e.g. ["Address", "City"]
    # Walk down from parent one tag at a time. Returns (True, last node) on
    # success, or (False, deepest node reached) if a step is missing.
    for localName in path or []:
        found, child = findChild(parent, localName)
        if not found:
            return False, parent
        parent = child
    return True, parent

if __name__ == "__main__":
    xml = "C:\\Users\\AJ2MSGR\\Downloads\\bangalor.txt"
    xmldoc = minidom.parse(xml)
    # The employee tags differ (Employee1, Employee2, ...), so take every
    # element child of the root instead of getElementsByTagName('Employee').
    finds = [node for node in xmldoc.documentElement.childNodes
             if node.nodeType == node.ELEMENT_NODE]

    bangaloreEmployees = []
    for element in finds:
        success, cityNode = followXMLPath(element, ["Address", "City"])  # gives back a tuple
        if success and cityNode.firstChild is not None:
            if cityNode.firstChild.data.strip() == "Bangalore":
                bangaloreEmployees.append(element)
        else:
            print("nope")

    for element in bangaloreEmployees:
        print("::")
        print(findChild(element, "ContactNo")[1].firstChild.data)
        print(findChild(element, "name")[1].firstChild.data)
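
A shorter minidom variant is also possible by starting from the City elements and walking up to the employee, rather than walking down from each employee. This is only a sketch under stated assumptions: City always sits at EmployeeN/Address/City as in the question's data, and the names XML_DATA, text_of, and names_for_city are illustrative, not from the code above.

```python
from xml.dom import minidom

# Trimmed copy of the question's data (illustrative; only the fields used here).
XML_DATA = """<Company>
  <Employee1>
      <Address><City>Bangalore</City></Address>
      <name> XXXXX</name>
  </Employee1>
  <Employee2>
      <Address><City>Chennai</City></Address>
      <name> YYYYYY</name>
  </Employee2>
  <Employee3>
      <Address><City>Bangalore</City></Address>
      <name> ZZZZZ</name>
  </Employee3>
</Company>"""


def text_of(node):
    """Concatenate the direct text-node children of a DOM element."""
    return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)


def names_for_city(doc, city):
    """Return the text of <name> for every employee whose City equals city."""
    names = []
    for city_node in doc.getElementsByTagName("City"):
        if text_of(city_node).strip() != city:
            continue
        # Assumed layout: City -> Address -> EmployeeN.
        employee = city_node.parentNode.parentNode
        # Tag matching is case-sensitive, so "name" does not match FirstName.
        for name_node in employee.getElementsByTagName("name"):
            names.append(text_of(name_node))
    return names


doc = minidom.parseString(XML_DATA)
print(names_for_city(doc, "Bangalore"))
```

The returned strings keep the leading space that the name elements contain in the source data; call strip() on them if that is not wanted.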