How to export a df to XML with a nested condition

Time:04-11

I need to write this df into an xml file:

Tree Dig    Ton         State   Dest
2   0122    national    normal  BO02GNP
2   1780    national    normal  D8NNG03
66  6621    national    normal  BO02GNP
66  6622    national    normal  BO02GNP

The desired result is:

<session-control>
    <digit-analysis>
        <digit-analysis-trees>
            <digit-analysis-tree>
                <analysis-tree>2</analysis-tree>
                <digit-analysis-list>
                    <digit-analysis>
                        <digits>0122</digits>
                        <type-of-number>national</type-of-number>
                        <state>normal</state>
                        <result-destination>BO02GNP</result-destination>
                    </digit-analysis>
                    <digit-analysis>
                        <digits>1780</digits>
                        <type-of-number>national</type-of-number>
                        <state>normal</state>
                        <result-destination>D8NNG03</result-destination>
                    </digit-analysis>
                </digit-analysis-list>
            </digit-analysis-tree>
            <digit-analysis-tree>
                <analysis-tree>66</analysis-tree>
                <digit-analysis-list>
                    <digit-analysis>
                        <digits>6621</digits>
                        <type-of-number>national</type-of-number>
                        <state>normal</state>
                        <result-destination>BO02GNP</result-destination>
                    </digit-analysis>
                    <digit-analysis>
                        <digits>6622</digits>
                        <type-of-number>national</type-of-number>
                        <state>normal</state>
                        <result-destination>BO02GNP</result-destination>
                    </digit-analysis>
                </digit-analysis-list>
            </digit-analysis-tree>
        </digit-analysis-trees>
    </digit-analysis>
</session-control>

The condition is that every time the df.Tree value changes, I need to close the "digit-analysis-tree" tag and open a new one.

I wrote this:

with open ('df1.xml', "ab") as f :
digana = ETree.Element('digit-analysis')
root.append(digana)
tree = ETree.ElementTree(root)
digana_trees = ETree.SubElement(digana, 'digit-analysis-trees', xmlns='http://nokia.com/nokia-tas/digana')
tree_checker = ''
if df_final.Tree != tree_checker:
    digana_tree = ETree.SubElement(digana_trees ,'digit-analysis-tree')
    ana_tree = ETree.SubElement(digana_tree, 'analysis-tree')
    ana_tree.text = df_final.Tree
    digana_list = ETree.SubElement(digana_tree, 'digit-analysis-list')
    digana_child = ETree.SubElement(digana_list, 'digit-analysis')
    digits = ETree.SubElement(digana_child, 'digits')
    digits.text = df_final.Dig
    ton = ETree.SubElement(digana_child, 'type-of-number')
    ton.text = df_final.Ton
    state = ETree.SubElement(digana_child, 'state')
    state.text = 'normal'
    res_dest = ETree.SubElement(digana_child, 'result_destination')
    res_dest.text = df_final.Dest

    tree_checker = ana_tree.text
else:
    digana_list = ETree.SubElement(digana_tree, 'digit-analysis-list')
    digana_child = ETree.SubElement(digana_list, 'digit-analysis')
    digits = ETree.SubElement(digana_child, 'digits')
    digits.text = df_final.Dig
    ton = ETree.SubElement(digana_child, 'type-of-number')
    ton.text = df_final.Ton
    state = ETree.SubElement(digana_child, 'state')
    state.text = 'normal'
    res_dest = ETree.SubElement(digana_child, 'result_destination')
    res_dest.text = df_final.Dest

tree.write(f, encoding='UTF-8', xml_declaration=True)

but it raises this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I tried pandas.DataFrame.to_xml(), but I could not get the result I need.

Thanks for your help

CodePudding user response:

Your XML has too many unnecessary nesting levels. Try to cut them down if possible.
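As for the error itself: `df_final.Tree != tree_checker` compares the whole column against a string, so it yields one boolean per row, and `if` cannot reduce that Series to a single True/False. A minimal sketch of the problem, assuming a small stand-in frame (the values here are illustrative, and `changes`/`previous` are hypothetical names):

```python
import pandas as pd

# Stand-in for the question's frame; values kept as strings to preserve leading zeros.
df_final = pd.DataFrame({"Tree": ["2", "2", "66"], "Dig": ["0122", "1780", "6621"]})

# Element-wise comparison: the result is a boolean Series, one value per row.
mask = df_final.Tree != ""

error_message = None
try:
    if mask:  # ambiguous: pandas cannot pick a single truth value for the Series
        pass
except ValueError as exc:
    error_message = str(exc)

# Comparing one scalar per row is unambiguous.
changes = []
previous = ""
for tree_id in df_final["Tree"]:
    changes.append(tree_id != previous)  # True whenever the Tree value changes
    previous = tree_id
```

Iterating row by row like this would work, but grouping avoids the manual "did the value change" bookkeeping altogether.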

It makes for a simpler loop if you group by the Tree ID before generating the XML:

from xml.etree import ElementTree as ETree
from io import StringIO

template = """
<session-control>
    <digit-analysis>
        <digit-analysis-trees />
    </digit-analysis>
</session-control>
"""

xml_tree = ETree.parse(StringIO(template))
all_da_trees = xml_tree.find(".//digit-analysis-trees")

for tree_id, group in df.groupby("Tree"):
    da_tree = ETree.SubElement(all_da_trees, "digit-analysis-tree")
    a_tree = ETree.SubElement(da_tree, "analysis-tree")
    a_tree.text = str(tree_id)
    da_list = ETree.SubElement(da_tree, "digit-analysis-list")

    for _, row in group.iterrows():
        elements = {
            "digits": row["Dig"],
            "type-of-number": row["Ton"],
            "state": row["State"],
            "result-destination": row["Dest"]
        }

        da = ETree.SubElement(da_list, "digit-analysis")
        for key, value in elements.items():
            ele = ETree.SubElement(da, key)
            ele.text = str(value)
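The snippet above builds the tree in memory but stops before writing it out. Here is a self-contained sketch of the whole round trip, assuming the frame from the question with all columns stored as strings (so `0122` keeps its leading zero), the output file name `df1.xml` from the question, and Python 3.9+ for `ETree.indent`. It builds the skeleton with `SubElement` instead of parsing a template, which is just an alternative way to get the same structure:

```python
import pandas as pd
from xml.etree import ElementTree as ETree

# Assumed input: the question's table, with every column as a string.
df = pd.DataFrame(
    {
        "Tree": ["2", "2", "66", "66"],
        "Dig": ["0122", "1780", "6621", "6622"],
        "Ton": ["national"] * 4,
        "State": ["normal"] * 4,
        "Dest": ["BO02GNP", "D8NNG03", "BO02GNP", "BO02GNP"],
    }
)

# Fixed outer skeleton of the document.
root = ETree.Element("session-control")
digana = ETree.SubElement(root, "digit-analysis")
all_da_trees = ETree.SubElement(digana, "digit-analysis-trees")

# sort=False keeps the groups in order of first appearance in the frame.
for tree_id, group in df.groupby("Tree", sort=False):
    da_tree = ETree.SubElement(all_da_trees, "digit-analysis-tree")
    ETree.SubElement(da_tree, "analysis-tree").text = str(tree_id)
    da_list = ETree.SubElement(da_tree, "digit-analysis-list")

    for _, row in group.iterrows():
        da = ETree.SubElement(da_list, "digit-analysis")
        ETree.SubElement(da, "digits").text = str(row["Dig"])
        ETree.SubElement(da, "type-of-number").text = str(row["Ton"])
        ETree.SubElement(da, "state").text = str(row["State"])
        ETree.SubElement(da, "result-destination").text = str(row["Dest"])

# Pretty-print in place (available from Python 3.9).
ETree.indent(root, space="    ")
xml_text = ETree.tostring(root, encoding="unicode")

# Write with an XML declaration; write() opens the path itself.
ETree.ElementTree(root).write("df1.xml", encoding="UTF-8", xml_declaration=True)
```

If the frame is loaded from a file, reading the columns as strings (for example with `dtype=str` in `pandas.read_csv`) is what keeps values such as `0122` intact.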