How do I insert a specific element/text at a specific location in an XML file using python?

Time:09-21

I'm currently using xml.etree.cElementTree in Python to parse XML files. I would like to know if it's possible to read data from another file and insert it at a specific location in an XML file.

Here's the XML file I'm working with:

<Data>
   <Action>A</Action>
   <FinalDate>2018-08-24</FinalDate>
   <InitialDate>2011-08-19</InitialDate>
   <DateOperation>
       <DateOperationCode>Append</DateOperationCode>
       <Date>2017-08-21</Date>
   </DateOperation>
<Data>

In this file, I would like to insert dates that are read from another file as text after the line "2017-08-21", so that the updated XML file can look like this:

<Data>
   <Action>A</Action>
   <FinalDate>2018-08-24</FinalDate>
   <InitialDate>2011-08-19</InitialDate>
   <DateOperation>
       <DateOperationCode>Append</DateOperationCode>
       <Date>2017-08-21</Date>
       <Date>2017-09-21</Date> #new date
       <Date>2017-10-21</Date> #new date
       <Date>2017-11-21</Date> #new date
   </DateOperation>
<Data>

I tried different ways to insert the dates, but none have worked so far.

CodePudding user response:

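In this case I would recommend lxml over ElementTree, because lxml supports full XPath 1.0 (including axes such as following-sibling), while ElementTree only supports a limited subset.

Also, neither of the XML samples in your question is well formed; for one thing, each ends with an opening <Data> tag instead of a closing </Data>.

So, assuming I understand you correctly, I would do the following: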

from lxml import etree

# Both XML samples, corrected to be well formed as best I understand them
file1 = """<Data>
   <Action>A</Action>
   <FinalDate>2018-08-24</FinalDate>
   <InitialDate>2011-08-19</InitialDate>
   <DateOperation>
       <DateOperationCode>Append</DateOperationCode>
       <Date>2017-08-21</Date>
    </DateOperation>
</Data>
"""
file2 ="""<Data>
   <Action>A</Action>
   <FinalDate>2018-08-24</FinalDate>
   <InitialDate>2011-08-19</InitialDate>
   <DateOperation>
       <DateOperationCode>Append</DateOperationCode>
       <Date>2017-08-21</Date>
       <Date>2017-09-21</Date>
       <Date>2017-10-21</Date> 
       <Date>2017-11-21</Date>
        </DateOperation>
</Data>"""

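doc1 = etree.XML(file1)
doc2 = etree.XML(file2)
# Text of the first Date already present in file1
baseline = doc1.xpath('//DateOperation/Date/text()')[0]

# Element in file1 that will receive the new dates
dest = doc1.xpath('//DateOperation')[0]
# Select every Date in file2 that follows the baseline date
for d in doc2.xpath(f'//Date[.="{baseline}"]/following-sibling::Date'):
    # append() moves the element out of doc2 and into doc1
    dest.append(d)
print(etree.tostring(doc1).decode())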

Output:

<Data>
   <Action>A</Action>
   <FinalDate>2018-08-24</FinalDate>
   <InitialDate>2011-08-19</InitialDate>
   <DateOperation>
       <DateOperationCode>Append</DateOperationCode>
       <Date>2017-08-21</Date>
    <Date>2017-09-21</Date>
       <Date>2017-10-21</Date> 
       <Date>2017-11-21</Date>
        </DateOperation>
</Data>

I notice that file1, after the modifications, looks very much like file2. If that's the case in real life, why not just use file2?
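If lxml is not available, a positional insert can also be sketched with the standard library alone. The example below is only an illustration under assumptions: the helper name insert_dates_after and the extra <Note> element are hypothetical, added to show that the new dates land directly after the matching Date rather than at the end of DateOperation.

```python
import xml.etree.ElementTree as ET

# Sample document; the <Note> element is hypothetical and only there to
# show that the insert happens after the Date, not at the end.
xml = """<Data>
   <DateOperation>
       <DateOperationCode>Append</DateOperationCode>
       <Date>2017-08-21</Date>
       <Note>trailing element</Note>
   </DateOperation>
</Data>"""


def insert_dates_after(root, anchor_text, new_dates):
    """Insert <Date> elements right after the <Date> whose text is anchor_text."""
    parent = root.find(".//DateOperation")
    children = list(parent)
    # Locate the anchor element and its position among its siblings
    anchor = next(c for c in children if c.tag == "Date" and c.text == anchor_text)
    position = children.index(anchor) + 1
    for offset, text in enumerate(new_dates):
        new_date = ET.Element("Date")
        new_date.text = text
        new_date.tail = anchor.tail  # reuse the anchor's trailing whitespace
        parent.insert(position + offset, new_date)
    return root


root = insert_dates_after(ET.fromstring(xml), "2017-08-21", ["2017-09-21", "2017-10-21"])
tags_and_text = [(c.tag, c.text) for c in root.find(".//DateOperation")]
print(tags_and_text)
```

Element.insert(index, element) takes a position among the parent's children, so the anchor's index has to be looked up first; ElementTree elements do not keep a reference to their parent or siblings.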

CodePudding user response:

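The following works with the standard library's ElementTree: create a Date element for each new date and append it to DateOperation.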

import xml.etree.ElementTree as ET

xml = '''<Data>
   <Action>A</Action>
   <FinalDate>2018-08-24</FinalDate>
   <InitialDate>2011-08-19</InitialDate>
   <DateOperation>
       <DateOperationCode>Append</DateOperationCode>
       <Date>2017-08-21</Date>
  </DateOperation>
</Data>
'''

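root = ET.fromstring(xml)
# Dates to insert; in practice these would be read from the other file
dates_to_add = ['2022-09-21', '2017-03-11', '2017-05-25']
# Parent element that holds the Date entries
date_oper_root = root.find('.//DateOperation')
for date in dates_to_add:
    new_date = ET.Element("Date")
    new_date.text = date
    # append() adds the new element as the last child of DateOperation
    date_oper_root.append(new_date)
ET.dump(root)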
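
Since the question asks for dates read from another file, here is a minimal sketch of that step. It assumes the dates are stored one per line in a plain text file; the file names data.xml and dates.txt and the helper append_dates_from_file are hypothetical, and temporary files are used only to keep the example self-contained.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def append_dates_from_file(xml_path, dates_path):
    """Append one <Date> per non-empty line of dates_path, then rewrite xml_path."""
    tree = ET.parse(xml_path)
    date_operation = tree.getroot().find(".//DateOperation")
    for line in Path(dates_path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        ET.SubElement(date_operation, "Date").text = line
    if hasattr(ET, "indent"):  # ET.indent is only available on Python 3.9+
        ET.indent(tree, space="   ")
    tree.write(xml_path, encoding="utf-8")
    return tree


# Demo with temporary files (assumed layout: one date per line)
with tempfile.TemporaryDirectory() as tmp:
    xml_file = Path(tmp) / "data.xml"
    dates_file = Path(tmp) / "dates.txt"
    xml_file.write_text(
        "<Data><DateOperation><DateOperationCode>Append</DateOperationCode>"
        "<Date>2017-08-21</Date></DateOperation></Data>"
    )
    dates_file.write_text("2017-09-21\n2017-10-21\n2017-11-21\n")
    append_dates_from_file(xml_file, dates_file)
    result_dates = [d.text for d in ET.parse(xml_file).getroot().iter("Date")]
    print(xml_file.read_text())
```

ET.SubElement creates the element and appends it to the parent in one call, which is equivalent to the Element plus append pair used above.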