I'm currently using xml.etree.cElementTree in Python to parse XML files. I would like to know whether it's possible to read data from another file and insert it at a specific location in an XML file.
Here's the XML file I'm working with:
<Data>
<Action>A</Action>
<FinalDate>2018-08-24</FinalDate>
<InitialDate>2011-08-19</InitialDate>
<DateOperation>
<DateOperationCode>Append</DateOperationCode>
<Date>2017-08-21</Date>
</DateOperation>
<Data>
In this file, I would like to insert dates that are read as text from another file, placing them right after the <Date>2017-08-21</Date> line, so that the updated XML file looks like this:
<Data>
<Action>A</Action>
<FinalDate>2018-08-24</FinalDate>
<InitialDate>2011-08-19</InitialDate>
<DateOperation>
<DateOperationCode>Append</DateOperationCode>
<Date>2017-08-21</Date>
<Date>2017-09-21</Date> #new date
<Date>2017-10-21</Date> #new date
<Date>2017-11-21</Date> #new date
</DateOperation>
<Data>
I tried different ways to insert the dates, but none have worked so far.
CodePudding user response:
In this case I would recommend lxml over ElementTree, because of its fuller XPath support.
Also, neither of your XML snippets is well formed; for example, each one ends with an opening <Data> tag instead of a closing </Data>.
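A quick way to see the well-formedness problem is to hand the XML from the question to a parser. This is only a minimal sketch using the standard library; the helper name is_well_formed is made up for illustration, and the "fix" simply swaps the trailing <Data> for </Data>, which is my reading of what was intended.

```python
import xml.etree.ElementTree as ET

# The XML as posted in the question: the last line reopens <Data>
# instead of closing it.
broken = """<Data>
<Action>A</Action>
<FinalDate>2018-08-24</FinalDate>
<InitialDate>2011-08-19</InitialDate>
<DateOperation>
<DateOperationCode>Append</DateOperationCode>
<Date>2017-08-21</Date>
</DateOperation>
<Data>"""


def is_well_formed(text):
    """Hypothetical helper: True if text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Assumed fix: replace the trailing opening tag with a closing tag.
fixed = broken[: -len("<Data>")] + "</Data>"

print(is_well_formed(broken))
print(is_well_formed(fixed))
```

Checking the input like this before editing it tends to give a clearer error than a failure halfway through the insert logic.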
So, assuming I understand you correctly, I would do the following:
from lxml import etree
# Both XML documents below have been corrected to be well formed, as best I understand the intent
file1 = """<Data>
<Action>A</Action>
<FinalDate>2018-08-24</FinalDate>
<InitialDate>2011-08-19</InitialDate>
<DateOperation>
<DateOperationCode>Append</DateOperationCode>
<Date>2017-08-21</Date>
</DateOperation>
</Data>
"""
file2 = """<Data>
<Action>A</Action>
<FinalDate>2018-08-24</FinalDate>
<InitialDate>2011-08-19</InitialDate>
<DateOperation>
<DateOperationCode>Append</DateOperationCode>
<Date>2017-08-21</Date>
<Date>2017-09-21</Date>
<Date>2017-10-21</Date>
<Date>2017-11-21</Date>
</DateOperation>
</Data>"""
doc1 = etree.XML(file1)
doc2 = etree.XML(file2)
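# Text of the first <Date> in doc1; used to locate the same entry in doc2
baseline = doc1.xpath('//DateOperation/Date/text()')[0]
dest = doc1.xpath('//DateOperation')[0]
# Move every <Date> that follows the baseline in doc2 into doc1
for d in doc2.xpath(f'//Date[.="{baseline}"]/following-sibling::Date'):
    dest.append(d)
print(etree.tostring(doc1).decode())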
Output:
<Data>
<Action>A</Action>
<FinalDate>2018-08-24</FinalDate>
<InitialDate>2011-08-19</InitialDate>
<DateOperation>
<DateOperationCode>Append</DateOperationCode>
<Date>2017-08-21</Date>
<Date>2017-09-21</Date>
<Date>2017-10-21</Date>
<Date>2017-11-21</Date>
</DateOperation>
</Data>
I notice that file1, after the modifications, looks very much like file2. If that's the case in real life, why not just use file2?
CodePudding user response:
The following works with the standard library alone:
import xml.etree.ElementTree as ET
xml = '''<Data>
<Action>A</Action>
<FinalDate>2018-08-24</FinalDate>
<InitialDate>2011-08-19</InitialDate>
<DateOperation>
<DateOperationCode>Append</DateOperationCode>
<Date>2017-08-21</Date>
</DateOperation>
</Data>
'''
root = ET.fromstring(xml)
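# Dates to add; in practice these could be read from another file
dates_to_add = ['2022-09-21', '2017-03-11', '2017-05-25']
date_oper_root = root.find('.//DateOperation')
for date in dates_to_add:
    # Build a new <Date> element and append it as the last child of <DateOperation>
    new_date = ET.Element("Date")
    new_date.text = date
    date_oper_root.append(new_date)
ET.dump(root)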
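To tie this back to the original question, here is a minimal sketch that reads the dates from a separate text file instead of hard-coding them. It assumes one date per line in that file. The file names data.xml, dates.txt and updated.xml and the helper insert_dates are hypothetical, made up for illustration. It uses Element.insert rather than append, so the new elements land directly after the last existing <Date> even if <DateOperation> has other children after the dates.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

SAMPLE_XML = """<Data>
<Action>A</Action>
<FinalDate>2018-08-24</FinalDate>
<InitialDate>2011-08-19</InitialDate>
<DateOperation>
<DateOperationCode>Append</DateOperationCode>
<Date>2017-08-21</Date>
</DateOperation>
</Data>
"""


def insert_dates(xml_path, dates_path, out_path):
    """Hypothetical helper: add one <Date> per non-empty line of dates_path."""
    tree = ET.parse(xml_path)
    parent = tree.getroot().find(".//DateOperation")
    if parent is None:
        raise ValueError("no <DateOperation> element found")
    # Assumption: the text file holds one date per line.
    with open(dates_path, encoding="utf-8") as fh:
        dates = [line.strip() for line in fh if line.strip()]
    children = list(parent)
    # Index just past the last existing <Date>; fall back to the end if none.
    date_positions = [i for i, c in enumerate(children) if c.tag == "Date"]
    position = date_positions[-1] + 1 if date_positions else len(children)
    for offset, text in enumerate(dates):
        new_date = ET.Element("Date")
        new_date.text = text
        new_date.tail = "\n"  # keep one element per line in the output
        parent.insert(position + offset, new_date)
    tree.write(out_path, encoding="utf-8", xml_declaration=True)
    return dates


# Demo with temporary files so the sketch runs without touching real data.
with tempfile.TemporaryDirectory() as tmp:
    xml_path = os.path.join(tmp, "data.xml")
    dates_path = os.path.join(tmp, "dates.txt")
    out_path = os.path.join(tmp, "updated.xml")
    with open(xml_path, "w", encoding="utf-8") as fh:
        fh.write(SAMPLE_XML)
    with open(dates_path, "w", encoding="utf-8") as fh:
        fh.write("2017-09-21\n2017-10-21\n2017-11-21\n")
    insert_dates(xml_path, dates_path, out_path)
    result = [d.text for d in ET.parse(out_path).getroot().iter("Date")]

print(result)
```

With the sample above, the printed list should start with the original 2017-08-21 followed by the three dates from the text file, in file order.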