How to modify xml to insert elements in Python?

Time:07-29

I have an XML document in the following form:

<PROJECT>
    <UPDATE_TYPE>FULL</UPDATE_TYPE>
    <PROJECT_NAME>GEN20x_BALBOA</PROJECT_NAME>
    <AAA>000</AAA>
    <BBB>CIVIC</BBB>
    <CCC>ECE</CCC>
    <BLOCK>
        <BLOCK1>
            <TYPE>BOOT</TYPE>
            <TYPE>BOOT</TYPE>
            <TASK>
                <VERSION>0.1</VERSION>
                <FILE>           
                     <INSTALL_METHOD INSTALL="first" />
                     <INSTALL_OPTIONS softwareType="aaa" />
                     <INSTALL_OPTIONS softwareType="qqq" />
               <FILE> 
            </TASK>
            <TASK>
                <VERSION>0.1</VERSION>
                <FILE>
                    <INSTALL_METHOD INSTALL="second" />
                    <INSTALL_OPTIONS softwareType="aaa" />
                    <INSTALL_OPTIONS softwareType="qqq" />
                    
                </FILE>
            </TASK>
            <TASK>
                 <VERSION>0.1</VERSION>
                 <FILE>
                   <INSTALL_METHOD INSTALL="third" />
                   <INSTALL_OPTIONS softwareType="aaa" />
                   <INSTALL_OPTIONS softwareType="qqq" />
                   
                 </FILE>
            </TASK>
        </BLOCK1>
    </BLOCK>
</PROJECT>

I need to insert another <INSTALL_OPTIONS> inside every TASK tag apart from the first one, so the result should look like this:

<PROJECT>
    <UPDATE_TYPE>FULL</UPDATE_TYPE>
    <PROJECT_NAME>GEN20x_BALBOA</PROJECT_NAME>
    <AAA>000</AAA>
    <BBB>CIVIC</BBB>
    <CCC>ECE</CCC>
    <BLOCK>
        <BLOCK1>
            <TYPE>BOOT</TYPE>
            <TYPE>BOOT</TYPE>
            <TASK>
                <VERSION>0.1</VERSION>
                <FILE>           
                     <INSTALL_METHOD INSTALL="first" />
                     <INSTALL_OPTIONS softwareType="aaa" />
                     <INSTALL_OPTIONS softwareType="qqq" />
               <FILE> 
            </TASK>
            <TASK>
                <VERSION>0.1</VERSION>
                <FILE>
                    <INSTALL_METHOD INSTALL="second" />
                    <INSTALL_OPTIONS softwareType="aaa" />
                    <INSTALL_OPTIONS softwareType="qqq" />
                    <INSTALL_OPTIONS softwareType="new" />
                </FILE>
            </TASK>
            <TASK>
                 <VERSION>0.1</VERSION>
                 <FILE>
                   <INSTALL_METHOD INSTALL="third" />
                   <INSTALL_OPTIONS softwareType="aaa" />
                   <INSTALL_OPTIONS softwareType="qqq" />
                   <INSTALL_OPTIONS softwareType="new" />
                 </FILE>
            </TASK>
        </BLOCK1>
    </BLOCK>
</PROJECT>

Could someone please help me with this?

I tried the following, but couldn't skip the first tag:

tasks = root.findall('.//BLOCK/BLOCK1/TASK')
new_io= ET.fromstring('<INSTALL_OPTIONS softwareType="new"/>')
for task in tasks:
    task.insert(3,new_io)

CodePudding user response:

I guess you could use enumerate when iterating over the tasks and ignore the first index.

Example:

import xml.etree.ElementTree as ET

xmldata = """<PROJECT>
    <UPDATE_TYPE>FULL</UPDATE_TYPE>
    <PROJECT_NAME>GEN20x_BALBOA</PROJECT_NAME>
    <AAA>000</AAA>
    <BBB>CIVIC</BBB>
    <CCC>ECE</CCC>
    <BLOCK>
        <BLOCK1>
            <TYPE>BOOT</TYPE>
            <TYPE>BOOT</TYPE>
            <TASK>
                <INSTALL_METHOD INSTALL="first" />
                <INSTALL_OPTIONS softwareType="aaa" />
                <INSTALL_OPTIONS softwareType="qqq" />
            </TASK>
            <TASK>
                <INSTALL_METHOD INSTALL="second" />
                <INSTALL_OPTIONS softwareType="aaa" />
                <INSTALL_OPTIONS softwareType="qqq" />
            </TASK>
            <TASK>
                <INSTALL_METHOD INSTALL="third" />
                <INSTALL_OPTIONS softwareType="aaa" />
                <INSTALL_OPTIONS softwareType="qqq" />
            </TASK>
        </BLOCK1>
    </BLOCK>
</PROJECT>"""

root = ET.fromstring(xmldata)
tasks = root.findall('.//BLOCK/BLOCK1/TASK')
for index, task in enumerate(tasks):
    # Skip the first TASK (index 0).
    if index > 0:
        # Parse a fresh element each time so the tasks don't share one object.
        new_io = ET.fromstring('<INSTALL_OPTIONS softwareType="new"/>')
        task.insert(3, new_io)

print(ET.tostring(root, encoding='utf8').decode('utf8'))

Result:

<?xml version='1.0' encoding='utf8'?>
<PROJECT>
    <UPDATE_TYPE>FULL</UPDATE_TYPE>
    <PROJECT_NAME>GEN20x_BALBOA</PROJECT_NAME>
    <AAA>000</AAA>
    <BBB>CIVIC</BBB>
    <CCC>ECE</CCC>
    <BLOCK>
        <BLOCK1>
            <TYPE>BOOT</TYPE>
            <TYPE>BOOT</TYPE>
            <TASK>
                <INSTALL_METHOD INSTALL="first" />
                <INSTALL_OPTIONS softwareType="aaa" />
                <INSTALL_OPTIONS softwareType="qqq" />
            </TASK>
            <TASK>
                <INSTALL_METHOD INSTALL="second" />
                <INSTALL_OPTIONS softwareType="aaa" />
                <INSTALL_OPTIONS softwareType="qqq" />
            <INSTALL_OPTIONS softwareType="new" /></TASK>
            <TASK>
                <INSTALL_METHOD INSTALL="third" />
                <INSTALL_OPTIONS softwareType="aaa" />
                <INSTALL_OPTIONS softwareType="qqq" />
            <INSTALL_OPTIONS softwareType="new" /></TASK>
        </BLOCK1>
    </BLOCK>
</PROJECT>
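
The new element ends up glued to </TASK> in the result above because ElementTree stores the whitespace between tags as text and tail strings, and a freshly parsed element has no tail. If that matters, one option is ET.indent, which I believe is only available from Python 3.9 onwards. A minimal sketch on a cut-down fragment (the fragment is just for illustration, not the full document):

```python
import xml.etree.ElementTree as ET

# Cut-down fragment for illustration only.
root = ET.fromstring(
    '<BLOCK1><TASK>\n    <INSTALL_METHOD INSTALL="second" />\n</TASK></BLOCK1>'
)
task = root.find('TASK')
task.append(ET.fromstring('<INSTALL_OPTIONS softwareType="new"/>'))

# ET.indent (Python 3.9+) rewrites whitespace-only text/tail values in place.
ET.indent(root, space="    ")
out = ET.tostring(root, encoding='unicode')
print(out)
```

Note that ET.indent changes the whitespace of the whole subtree it is given, so it is best called once, right before serializing.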

CodePudding user response:

tasks is a list with three items. You are only interested in the last two. You can get a list with only those items by creating a slice as follows:

tasks = root.findall('.//BLOCK/BLOCK1/TASK')[1:]
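
A small sketch of the slice in use, on a reduced stand-in for the sample XML (the flat TASK layout is an assumption for brevity). It uses ET.SubElement so that each TASK gets its own new element rather than sharing one object:

```python
import xml.etree.ElementTree as ET

# Reduced stand-in for the question's document (assumption: flat TASK children).
xmldata = """<PROJECT><BLOCK><BLOCK1>
<TASK><INSTALL_METHOD INSTALL="first" /></TASK>
<TASK><INSTALL_METHOD INSTALL="second" /></TASK>
<TASK><INSTALL_METHOD INSTALL="third" /></TASK>
</BLOCK1></BLOCK></PROJECT>"""

root = ET.fromstring(xmldata)

# [1:] drops the first TASK and keeps the rest.
for task in root.findall('.//BLOCK/BLOCK1/TASK')[1:]:
    # SubElement creates a new child and appends it at the end of task.
    ET.SubElement(task, 'INSTALL_OPTIONS', {'softwareType': 'new'})

# Number of INSTALL_OPTIONS children per TASK, in document order.
counts = [len(task.findall('INSTALL_OPTIONS')) for task in root.iter('TASK')]
print(counts)
```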

CodePudding user response:

A couple of things: First, your sample xml is not well formed; the <FILE> element in the first <TASK> needs to be closed like so: </FILE>.

Second, since the XML is nested, you are inserting the new element in the wrong place: it belongs inside <FILE>, not directly under <TASK>. Try something like this:

for task in tasks[1:]:
    task.find('.//FILE').insert(3,new_io)

The output, given your (corrected) sample xml, should be your expected (corrected) output.
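
For completeness, here is a self-contained sketch that puts the pieces together. It assumes a trimmed version of the sample with the first <FILE> properly closed, and it appends a copy of the new element to each FILE (rather than inserting at a fixed index) so the position does not depend on how many children are already there:

```python
import copy
import xml.etree.ElementTree as ET

# Trimmed, corrected sample: the first FILE element is properly closed.
xmldata = """<PROJECT>
    <BLOCK>
        <BLOCK1>
            <TASK>
                <VERSION>0.1</VERSION>
                <FILE>
                    <INSTALL_METHOD INSTALL="first" />
                    <INSTALL_OPTIONS softwareType="aaa" />
                    <INSTALL_OPTIONS softwareType="qqq" />
                </FILE>
            </TASK>
            <TASK>
                <VERSION>0.1</VERSION>
                <FILE>
                    <INSTALL_METHOD INSTALL="second" />
                    <INSTALL_OPTIONS softwareType="aaa" />
                    <INSTALL_OPTIONS softwareType="qqq" />
                </FILE>
            </TASK>
            <TASK>
                <VERSION>0.1</VERSION>
                <FILE>
                    <INSTALL_METHOD INSTALL="third" />
                    <INSTALL_OPTIONS softwareType="aaa" />
                    <INSTALL_OPTIONS softwareType="qqq" />
                </FILE>
            </TASK>
        </BLOCK1>
    </BLOCK>
</PROJECT>"""

root = ET.fromstring(xmldata)
new_io = ET.fromstring('<INSTALL_OPTIONS softwareType="new"/>')

# Every TASK except the first one.
for task in root.findall('.//BLOCK/BLOCK1/TASK')[1:]:
    file_elem = task.find('FILE')
    if file_elem is not None:
        # Append a copy so each FILE owns its own element object.
        file_elem.append(copy.deepcopy(new_io))

# softwareType values per FILE, in document order.
types = [
    [io.get('softwareType') for io in f.findall('INSTALL_OPTIONS')]
    for f in root.iter('FILE')
]
print(types)
```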
