Converting XML to dictionary in Python

Time:03-20

I need to convert part of my XML file into a Python dict so that I can store the values in a database.

XML

<unit id="storage[0].description"><new><datastore>sample text here</datastore></new></unit>
<unit id="storage[0].title"><new><datastore>image title</datastore></new></unit>
<unit id="storage[0].dokcs[0]"><new><datastore>elemntry101</datastore></new></unit>
<unit id="storage[0].dokcs[1]"><new><datastore>elemntry103</datastore></new></unit>

Expected Python output

   {'storage': [{'description': 'sample text here', 'title': 'image title', 'dokcs': ['elemntry101', 'elemntry103']}]}

Please advise how to build this structure using ElementTree.

I tried this code

   for data in xml_string.findall(".//"):
       self.logger.error(f" {data.get('id')}:{data.findtext('datastore')}")

but I could not get from there to the structure I wanted.

CodePudding user response:

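Try something like this:

import xml.etree.ElementTree as ET

# xml_string must have a single root element wrapping the <unit> elements
doc = ET.fromstring(xml_string)

# Collect the text of every <datastore> in document order
dats = [u.text.strip() for u in doc.findall('.//datastore')]

# Relies on position: first is the description, second the title, the rest are dokcs
storage = {'description': dats[0], 'title': dats[1], 'dokcs': dats[2:]}
result = {'storage': [storage]}
print(result)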
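The snippet above calls ET.fromstring on the XML, which only works when the string has exactly one top-level element. The fragment in the question has four sibling <unit> elements and no root, so a minimal sketch of one way around that is to wrap it first. The wrapper tag name "root" is an assumption made for illustration. The sketch also uses the path "new/datastore", because findtext with a bare tag name only looks at direct children, and <datastore> sits one level below <unit>.

```python
import xml.etree.ElementTree as ET

# The four <unit> elements from the question, with no enclosing root element.
fragment = (
    '<unit id="storage[0].description"><new><datastore>sample text here</datastore></new></unit>'
    '<unit id="storage[0].title"><new><datastore>image title</datastore></new></unit>'
    '<unit id="storage[0].dokcs[0]"><new><datastore>elemntry101</datastore></new></unit>'
    '<unit id="storage[0].dokcs[1]"><new><datastore>elemntry103</datastore></new></unit>'
)

# "root" is a hypothetical wrapper tag, added only so the parser sees one top-level element.
doc = ET.fromstring(f"<root>{fragment}</root>")

# Pair each unit's id with the text of its nested <datastore>.
pairs = [(u.get("id"), u.findtext("new/datastore")) for u in doc.iter("unit")]

for unit_id, text in pairs:
    print(unit_id, "->", text)
```

Keeping the id next to its text, rather than collecting the texts alone, avoids depending on the order of the elements in the file.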

CodePudding user response:

Maybe this is going to help you.

import re
import xml.etree.ElementTree as ET

from pprint import pprint

tree = ET.parse('data.xml')
root = tree.getroot()

# Split each id such as "storage[0].dokcs[1]" on "." and drop the "[n]" index
ids = []
for unit in root.iter('unit'):
  parts = unit.attrib['id'].split('.')
  ids.append([re.match(r"[a-zA-Z0-9]*", p).group(0) for p in parts[:2]])

texts = [e.text for e in root.iter('datastore')]

# One list per top-level name, e.g. {'storage': []}
xml_dict = {name: [] for name, _ in ids}

for (name, key), text in zip(ids, texts):
  # Append to the entry that already holds this key, or start a new entry
  for entry in xml_dict[name]:
    if key in entry:
      entry[key].append(text)
      break
  else:
    xml_dict[name].append({key: [text]})

pprint(xml_dict)

OUTPUT:

 {'storage': [{'description': ['sample text here']},
             {'title': ['image title']},
             {'dokcs': ['elemntry101', 'elemntry103']}]}
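The output above puts each key in its own dict and wraps single values in lists, which is not quite the structure asked for in the question. One possible alternative, offered as a sketch rather than a definitive solution, is to treat each id as a path and honor the "[n]" indexes instead of discarding them. The helper names parse_path, set_path and units_to_dict are hypothetical, and the <root> wrapper is an assumption, since the question shows only a fragment.

```python
import re
import xml.etree.ElementTree as ET

# Matches either a name ("storage") or a bracketed index ("[0]").
TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def parse_path(path):
    """Split 'storage[0].dokcs[1]' into ['storage', 0, 'dokcs', 1]."""
    return [name if name else int(index) for name, index in TOKEN.findall(path)]


def new_container(next_key):
    """A list if the next step is an index, otherwise a dict."""
    return [] if isinstance(next_key, int) else {}


def set_path(target, keys, value):
    """Store value in target at the nested location described by keys."""
    node = target
    for key, next_key in zip(keys, keys[1:] + [None]):
        last = next_key is None
        if isinstance(key, int):
            # Pad the list so that position `key` exists.
            while len(node) <= key:
                node.append(None)
            if last:
                node[key] = value
            else:
                if node[key] is None:
                    node[key] = new_container(next_key)
                node = node[key]
        elif last:
            node[key] = value
        else:
            node = node.setdefault(key, new_container(next_key))
    return target


def units_to_dict(root):
    """Build a nested dict from every <unit id="..."> under root."""
    result = {}
    for unit in root.iter("unit"):
        text = unit.findtext("new/datastore")
        set_path(result, parse_path(unit.get("id")), text)
    return result


# Assumed wrapper: the question's fragment placed inside a single <root> element.
xml_string = """<root>
<unit id="storage[0].description"><new><datastore>sample text here</datastore></new></unit>
<unit id="storage[0].title"><new><datastore>image title</datastore></new></unit>
<unit id="storage[0].dokcs[0]"><new><datastore>elemntry101</datastore></new></unit>
<unit id="storage[0].dokcs[1]"><new><datastore>elemntry103</datastore></new></unit>
</root>"""

result = units_to_dict(ET.fromstring(xml_string))
print(result)
```

For the sample data this prints {'storage': [{'description': 'sample text here', 'title': 'image title', 'dokcs': ['elemntry101', 'elemntry103']}]}. Because the indexes come from the id itself, the result does not depend on the order in which the <unit> elements appear.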