I have a question. How can I create a list of lists from an XML dataset using ElementTree in Python? For example, I have this dataset:
<data>
<article>
<author> Tony </author>
</article>
<article>
<author> Andy </author>
<author> John </author>
</article>
<article>
<author> Paul </author>
<author> leon </author>
</article>
</data>
I know I can iterate over all the authors like this:
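import xml.etree.ElementTree as ET
tree = ET.parse('data.xml')
root = tree.getroot()
for author in root.iter('author'):
    # get() returns None when the 'pid' attribute is absent, instead of raising KeyError
    print(author.text, author.get('pid'))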
That finds and shows every author in the dataset as one flat sequence: [Tony, Andy, John, Paul, Leon]. How can I change the code above so that the authors are grouped per article, as a list of lists: [[Tony], [Andy, John], [Paul, Leon]]? Is there a specific function for this?
CodePudding user response:
I found the solution. It's quite simple: use a nested loop that iterates over each child of root, and then over each child2 of child that has the specific tag.
author_connection = []
for child in root:
    # one inner list per direct child of root (each <article>)
    conn = []
    author_connection.append(conn)
    for child2 in child.iter('author'):
        conn.append(child2.text)
And the output is
[[Tony], [Andy, John], [Paul, Leon]]
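For anyone who wants to try this without a data.xml file, here is a minimal self-contained sketch of the same nested loop. It assumes the XML is held in a string; the names xml_text and authors_per_article are my own and not part of ElementTree. Note that the real values keep the spaces around each name, because they are part of the element text.

```python
import xml.etree.ElementTree as ET

# Sample data from the question, held in a string (xml_text is a made-up name).
xml_text = """<data>
<article><author> Tony </author></article>
<article><author> Andy </author><author> John </author></article>
<article><author> Paul </author><author> leon </author></article>
</data>"""


def authors_per_article(root):
    """Return one list of author texts per direct child of root (hypothetical helper)."""
    groups = []
    for child in root:                       # each <article>
        conn = []
        for child2 in child.iter('author'):  # every <author> beneath this article
            conn.append(child2.text)
        groups.append(conn)
    return groups


root = ET.fromstring(xml_text)
author_connection = authors_per_article(root)
print(author_connection)
```

One thing to be aware of: iter('author') also descends into nested elements, so if an article could contain author tags deeper down that should not count, findall('author') (direct children only) may be the safer choice.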
CodePudding user response:
A one-liner can do the magic:
import xml.etree.ElementTree as ET
xml = '''<data>
<article>
<author> Tony </author>
</article>
<article>
<author> Andy </author>
<author> John </author>
</article>
<article>
<author> Paul </author>
<author> leon </author>
</article>
</data>'''
root = ET.fromstring(xml)
authors = [[aut.text for aut in art.findall('author')] for art in root.findall('./article')]
print(authors)
Output:
[[' Tony '], [' Andy ', ' John '], [' Paul ', ' leon ']]
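The output above keeps the spaces around each name because they are part of the element text. If that is not wanted, a possible variation is to strip each value inside the comprehension. This is only a sketch: clean_authors and xml_text are names I made up, and treating a missing text as an empty string is my own assumption about what you would want for an empty author tag.

```python
import xml.etree.ElementTree as ET

# Same sample data as above, held in a string (xml_text is a made-up name).
xml_text = """<data>
<article><author> Tony </author></article>
<article><author> Andy </author><author> John </author></article>
<article><author> Paul </author><author> leon </author></article>
</data>"""


def clean_authors(root):
    """Group author names per <article>, with surrounding whitespace removed."""
    return [
        # aut.text is None for an empty element such as <author/>,
        # so fall back to '' before calling strip() (an assumption, see above).
        [(aut.text or '').strip() for aut in art.findall('author')]
        for art in root.findall('./article')
    ]


root = ET.fromstring(xml_text)
authors_clean = clean_authors(root)
print(authors_clean)
```

With the sample data this gives [['Tony'], ['Andy', 'John'], ['Paul', 'leon']].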