Python - Go through all xml files in directory, take one element with its sub-elements and pla

Time:09-23

I need to go through all the XML files in one directory:

Get.xml
Set.xml
Try.xml
etc..

Each of them has a similar, but not identical, structure and contains elements like this:

<example atrib1='' atrib2= ''...>
   <summary atrib1='' atrib2= ''...>
      <properties>
      </properties>
   </summary>
   <Elem>
     <element1>
       <element2>
         <subelement2>
             ........ 
         </subelement2>
       </element2>
     </element1>
   </Elem>
</example>

But then I have another file, let's call it Main.xml, which contains Get, Set and Try as the names of its elements:

<example atrib1='' atrib2= ''...>
   <summary atrib1='' atrib2= ''...>
      <properties>
      </properties>
   </summary>
   <Test name="Get">
   </Test>
   <Test name="Set">
   </Test>
   <Test name="Try">
   </Test>
</example>

As mentioned, I need to go through all of the XML files, take the Elem element with its sub-elements from each one, and put it inside Main.xml under the Test element whose name matches the current file, so the final result should be:

Main.xml

<example atrib1='' atrib2= ''...>
   <summary atrib1='' atrib2= ''...>
      <properties>
      </properties>
   </summary>
   <Test name="Get">
     <Elem>
       <element1>
         <element2>
           <subelement2>
               ........ 
           </subelement2>
         </element2>
       </element1>
     </Elem>
   </Test>
   <Test name="Set">
     <Elem>
       <element1>
         <element2>
           <subelement2>
               ........ 
           </subelement2>
         </element2>
       </element1>
     </Elem>
   </Test>
   <Test name="Try">
     <Elem>
       <element1>
         <element2>
           <subelement2>
               ........ 
           </subelement2>
         </element2>
       </element1>
     </Elem>
   </Test>
</example>

At the moment I have a couple of functions that replace an element in one XML file with the same-named element from another, but I am having a hard time adapting them so that I can copy a whole element to an exact spot in another file:

def find_child(node, with_name):
    """Recursively find the first descendant of node whose tag is with_name"""
    for element in list(node):
        if element.tag == with_name:
            return element
        elif list(element):
            # Search the subtree for the same tag name
            sub_result = find_child(element, with_name)
            if sub_result is not None:
                return sub_result
    return None

def replace_node(from_tree, to_tree, node_name):
    """
    Replace node with given node_name in to_tree with
    the same-named node from the from_tree
    """
    # Find nodes of given name in each tree
    from_node = find_child(from_tree.getroot(), 'Elem')
    to_node = find_child(to_tree.getroot(), 'Test')

    # Find where to substitute the from_node into the to_tree
    to_parent, to_index = get_node_parent_info(to_tree, to_node)

    # Replace to_node with from_node
    to_parent.remove(to_node)
    to_parent.insert(to_index, from_node)

def get_node_parent_info(tree, node):
    """
    Return tuple of (parent, index) where:
        parent = node's parent within tree
        index = index of node under parent
    """
    parent_map = {c:p for p in tree.iter() for c in p}
    parent = parent_map[node]
    return parent, list(parent).index(node)

# ET is xml.etree.ElementTree; `files` holds the source XML file names
for filename in files:
    from_tree = ET.parse(filename)
    to_tree = ET.parse('Main.xml')
    
    replace_node(from_tree, to_tree, 'Elem')
    
    ET.dump(to_tree)
    to_tree.write('Main.xml')

I know this won't work, because the two files have no identically named elements that could be swapped. I need a better solution, please assist!

I have also tried something like this, just to simply copy the whole element, but with no success:

source_tree = ET.parse('Get.xml')
source_root = source_tree.getroot() 
dest_tree = ET.parse('Main.xml')
dest_root = dest_tree.getroot()
for element in source_root:
    if element.tag == 'Elem':
        for delement in dest_root.iter('Test'):
            name = delement.get('name')
            if name == 'Get':
                delement.append(element)
                dest_tree.write('Main.xml', encoding='utf-8', xml_declaration=True)

I hope it is clear what has to be done here. Please let me know if you have any ideas on how this can be done. Thanks!

CodePudding user response:

I'm not sure if this is what you want, but it inserts all the Elem elements under the correct Test element.

import xml.etree.ElementTree as ET

main_tree = ET.parse('Main.xml')
for test_elem in main_tree.findall('Test'):
    tree = ET.parse(f"{test_elem.get('name')}.xml")
    for elem in tree.findall("Elem"):
        test_elem.append(elem)

with open('newmain.xml', 'wb') as f:
    main_tree.write(f)
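
In case it helps, here is a slightly more defensive sketch of the same idea. The function name merge_elems and its parameters are made up for illustration and are not from the question. It assumes every source file is named after the name attribute of its Test element and sits in one directory, it copies each Elem with copy.deepcopy so the source tree is left untouched, and it returns the names for which no file was found instead of raising.

```python
import copy
import xml.etree.ElementTree as ET
from pathlib import Path


def merge_elems(main_path, source_dir, out_path):
    """Copy each <Elem> from <name>.xml into the matching <Test name="...">.

    Hypothetical helper: returns the Test names that had no source file.
    """
    main_tree = ET.parse(str(main_path))
    missing = []
    for test_elem in main_tree.getroot().iter('Test'):
        name = test_elem.get('name')
        source_file = Path(source_dir) / f"{name}.xml"
        if not source_file.is_file():
            # No Get.xml / Set.xml / ... for this Test: remember and move on
            missing.append(name)
            continue
        source_root = ET.parse(str(source_file)).getroot()
        # Only direct children of the source root are considered
        for elem in source_root.findall('Elem'):
            test_elem.append(copy.deepcopy(elem))
    main_tree.write(str(out_path), encoding='utf-8', xml_declaration=True)
    return missing
```

Calling merge_elems('Main.xml', '.', 'newmain.xml') would then write the merged document to newmain.xml, as in the snippet above. Writing to a separate output file keeps Main.xml unchanged, so the script can be run again without appending the same Elem twice.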

CodePudding user response:

So I have managed to get this working; the only problem is that the XML is not written out pretty-printed:

import os
import xml.etree.ElementTree as ET

# `location` is the directory holding the XML files (not given in the post)
files = os.listdir(location)
for xml in files:
    if xml.endswith('.xml'):
        source_tree = ET.parse(xml)
        source_root = source_tree.getroot()
        dest_tree = ET.parse('Main.xml')
        dest_root = dest_tree.getroot()
        for element in source_root:
            if element.tag == 'Elem':
                to_copy = element
                for delement in dest_root.iter('Test'):
                    name = delement.get('name')
                    # Match the Test name against the current file name
                    if name + '.xml' == xml:
                        destination_root = delement
                        destination_root.append(to_copy)
                        dest_tree.write('Main.xml', encoding='utf-8', xml_declaration=True)
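
For the pretty-print part, one option is xml.etree.ElementTree.indent, which exists in the standard library from Python 3.9 onwards. The helper below is only a sketch: the name write_pretty and the three-space default are my own choices, picked to match the indentation used in the question's examples.

```python
import xml.etree.ElementTree as ET


def write_pretty(tree, out_path, indent="   "):
    """Re-indent the tree in place, then write it (hypothetical helper)."""
    # ET.indent requires Python 3.9+; it rewrites the whitespace between tags
    ET.indent(tree, space=indent)
    tree.write(out_path, encoding='utf-8', xml_declaration=True)
```

With that in place, the dest_tree.write(...) call could be swapped for write_pretty(dest_tree, 'Main.xml'). On older Python versions ET.indent is not available, so a different approach would be needed there.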