Given XML-formatted data, I want to read it in key-value format.
For example, given:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
<example></example>
</note>
I want to have:
[[to, Tove], [from, Jani], [heading, Reminder], [body, Don't forget me this weekend!], [example, ]]
As you can see, I have not included note in my output, since it only wraps the other elements and has no value of its own.
CodePudding user response:
Please note that the expected output in your question is not valid Python syntax. I assume you need a list of lists, each holding two strings (tag and text). If that's the case, you can use the built-in xml.etree.ElementTree module.
>>> import xml.etree.ElementTree as ET
>>>
>>> s = """
... <note>
... <to>Tove</to>
... <from>Jani</from>
... <heading>Reminder</heading>
... <body>Don't forget me this weekend!</body>
... <example></example>
... </note>
... """
>>>
>>>
>>> [[child.tag, child.text if child.text else ""] for child in ET.fromstring(s)]
[['to', 'Tove'], ['from', 'Jani'], ['heading', 'Reminder'], ['body', "Don't forget me this weekend!"], ['example', '']]
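If you need this in more than one place, the same comprehension can be wrapped in a small helper. This is only a sketch: the name xml_to_pairs is made up for illustration, it only looks at the direct children of the root element, and stripping surrounding whitespace from the text is an assumption about what you want rather than something the question asks for.

```python
import xml.etree.ElementTree as ET


def xml_to_pairs(xml_text):
    """Return [tag, text] pairs for the direct children of the root element.

    Hypothetical helper for illustration; nested elements are not descended into.
    """
    root = ET.fromstring(xml_text)
    # child.text is None for an empty element such as <example></example>,
    # so fall back to an empty string. The strip() is an added assumption.
    return [[child.tag, (child.text or "").strip()] for child in root]


sample = """<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
<example></example>
</note>"""

pairs = xml_to_pairs(sample)
print(pairs)

# A list of two-item lists converts directly to a dict if lookup by tag is needed.
as_dict = dict(pairs)
print(as_dict["from"])
```

Note that converting to a dict keeps only the last value if a tag appears more than once, so the list form is the safer choice when tags can repeat.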
CodePudding user response:
You can use xmltodict (a third-party package, not part of the standard library) to convert it to a dictionary:
import xmltodict

xml_text = """<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
<example></example>
</note>"""

# parse() returns a nested dict keyed by tag name; the empty <example> element maps to None
xml_dict = xmltodict.parse(xml_text)
print(xml_dict)
# {'note': {'to': 'Tove', 'from': 'Jani', 'heading': 'Reminder', 'body': "Don't forget me this weekend!", 'example': None}}
You can then access each value by tag name:
xml_dict['note']['from']  # 'Jani'
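If you still want the list-of-pairs shape from the question, the inner dictionary can be flattened with a comprehension. A minimal sketch, assuming the parse result looks exactly like the dictionary printed above; it is written out here as a literal so the snippet runs without xmltodict installed, and None is replaced with an empty string to match the requested output.

```python
# Parse result copied from the printed output above (assumed, not re-parsed here).
xml_dict = {
    "note": {
        "to": "Tove",
        "from": "Jani",
        "heading": "Reminder",
        "body": "Don't forget me this weekend!",
        "example": None,
    }
}

# Skip the root key ("note") and turn each child entry into a [tag, text] pair,
# mapping None (an empty element) to "".
pairs = [
    [tag, text if text is not None else ""]
    for tag, text in xml_dict["note"].items()
]
print(pairs)
```

This relies on every child of note holding plain text; nested elements or attributes would show up as dictionaries instead of strings and would need separate handling.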