I need to write this DataFrame (df) to an XML file:
Tree  Dig   Ton       State   Dest
2     0122  national  normal  BO02GNP
2     1780  national  normal  D8NNG03
66    6621  national  normal  BO02GNP
66    6622  national  normal  BO02GNP
The desired result should be:
<session-control>
  <digit-analysis>
    <digit-analysis-trees>
      <digit-analysis-tree>
        <analysis-tree>2</analysis-tree>
        <digit-analysis-list>
          <digit-analysis>
            <digits>0122</digits>
            <type-of-number>national</type-of-number>
            <state>normal</state>
            <result-destination>BO02GNP</result-destination>
          </digit-analysis>
          <digit-analysis>
            <digits>01780</digits>
            <type-of-number>national</type-of-number>
            <state>normal</state>
            <result-destination>D8NNG03</result-destination>
          </digit-analysis>
        </digit-analysis-list>
      </digit-analysis-tree>
      <digit-analysis-tree>
        <analysis-tree>66</analysis-tree>
        <digit-analysis-list>
          <digit-analysis>
            <digits>6610</digits>
            <type-of-number>national</type-of-number>
            <state>normal</state>
            <result-destination>BO02GNP</result-destination>
          </digit-analysis>
          <digit-analysis>
            <digits>6611</digits>
            <type-of-number>national</type-of-number>
            <state>normal</state>
            <result-destination>BO02GNP</result-destination>
          </digit-analysis>
        </digit-analysis-list>
      </digit-analysis-tree>
    </digit-analysis-trees>
  </digit-analysis>
</session-control>
The condition is that every time the df.Tree value changes, I need to close the current "digit-analysis-tree" tag and open a new one.
I wrote this:
with open('df1.xml', "ab") as f:
    digana = ETree.Element('digit-analysis')
    root.append(digana)
    tree = ETree.ElementTree(root)
    digana_trees = ETree.SubElement(digana, 'digit-analysis-trees', xmlns='http://nokia.com/nokia-tas/digana')
    tree_checker = ''
    if df_final.Tree != tree_checker:
        digana_tree = ETree.SubElement(digana_trees, 'digit-analysis-tree')
        ana_tree = ETree.SubElement(digana_tree, 'analysis-tree')
        ana_tree.text = df_final.Tree
        digana_list = ETree.SubElement(digana_tree, 'digit-analysis-list')
        digana_child = ETree.SubElement(digana_list, 'digit-analysis')
        digits = ETree.SubElement(digana_child, 'digits')
        digits.text = df_final.Dig
        ton = ETree.SubElement(digana_child, 'type-of-number')
        ton.text = df_final.Ton
        state = ETree.SubElement(digana_child, 'state')
        state.text = 'normal'
        res_dest = ETree.SubElement(digana_child, 'result_destination')
        res_dest.text = df_final.Dest
        tree_checker = ana_tree.text
    else:
        digana_list = ETree.SubElement(digana_tree, 'digit-analysis-list')
        digana_child = ETree.SubElement(digana_list, 'digit-analysis')
        digits = ETree.SubElement(digana_child, 'digits')
        digits.text = df_final.Dig
        ton = ETree.SubElement(digana_child, 'type-of-number')
        ton.text = df_final.Ton
        state = ETree.SubElement(digana_child, 'state')
        state.text = 'normal'
        res_dest = ETree.SubElement(digana_child, 'result_destination')
        res_dest.text = df_final.Dest
    tree.write(f, encoding='UTF-8', xml_declaration=True)
but it raises this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I tried to use pandas.DataFrame.to_xml() but I was not able to reach my goal ...
Thanks for your help
CodePudding user response:
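As for the error itself: df_final.Tree is the whole column, so `df_final.Tree != tree_checker` compares every row at once and produces a boolean Series, which `if` cannot reduce to a single True or False. A minimal sketch of that behaviour, assuming all columns were read as text (the small df_final below is a hypothetical stand-in for yours):

```python
import pandas as pd

# Hypothetical stand-in for df_final; values kept as strings
df_final = pd.DataFrame({"Tree": ["2", "2", "66"], "Dig": ["0122", "1780", "6621"]})

# Element-wise comparison: the result is a Series of booleans, not one bool
mask = df_final.Tree != ""

try:
    if mask:  # same pattern as the `if` in the question
        pass
except ValueError as exc:
    error_message = str(exc)

# Comparing one row's value at a time yields a plain bool instead
changes = []
previous = ""
for tree_id in df_final.Tree:
    changes.append(tree_id != previous)
    previous = tree_id
```

So the comparison has to happen per row (or per group), not on the column as a whole.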
Your XML has too many unnecessary nesting levels. Try to cut them down if possible.
It makes for a simpler loop if you group by the Tree ID before generating the XML:
from xml.etree import ElementTree as ETree
from io import StringIO

template = """
<session-control>
<digit-analysis>
<digit-analysis-trees />
</digit-analysis>
</session-control>
"""

xml_tree = ETree.parse(StringIO(template))
all_da_trees = xml_tree.find(".//digit-analysis-trees")

for tree_id, group in df.groupby("Tree"):
    da_tree = ETree.SubElement(all_da_trees, "digit-analysis-tree")
    a_tree = ETree.SubElement(da_tree, "analysis-tree")
    a_tree.text = str(tree_id)
    da_list = ETree.SubElement(da_tree, "digit-analysis-list")
    for _, row in group.iterrows():
        elements = {
            "digits": row["Dig"],
            "type-of-number": row["Ton"],
            "state": row["State"],
            "result-destination": row["Dest"]
        }
        da = ETree.SubElement(da_list, "digit-analysis")
        for key, value in elements.items():
            ele = ETree.SubElement(da, key)
            ele.text = str(value)
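The loop only builds the tree in memory; it still has to be serialized. A rough sketch of that last step, assuming Python 3.9+ for ETree.indent and using a tiny stand-in tree with a single entry instead of the full result. It writes to an in-memory buffer here, and a file opened with open("df1.xml", "wb") should work the same way:

```python
from io import BytesIO, StringIO
from xml.etree import ElementTree as ETree

# Small stand-in for the xml_tree built by the loop (one tree, no entries)
template = """
<session-control>
<digit-analysis>
<digit-analysis-trees />
</digit-analysis>
</session-control>
"""
xml_tree = ETree.parse(StringIO(template))
all_da_trees = xml_tree.find(".//digit-analysis-trees")
da_tree = ETree.SubElement(all_da_trees, "digit-analysis-tree")
ETree.SubElement(da_tree, "analysis-tree").text = "2"

# Re-indent the tree in place so the output is readable (Python 3.9+)
ETree.indent(xml_tree, space="  ")

# Serialize with an XML declaration; a binary file handle can replace the buffer
buffer = BytesIO()
xml_tree.write(buffer, encoding="UTF-8", xml_declaration=True)
xml_text = buffer.getvalue().decode("utf-8")
```

Note that the question opens the file in "ab" mode. Appending a second document to an existing file on the next run would leave two root elements in it, which is no longer well-formed XML, so "wb" is probably what you want.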