I'm having an issue with my XML file. I would like to achieve the same as in: https://www.delftstack.com/howto/python/xml-to-csv-python/
However, my XML file looks a bit different, for example:
<students>
<student name="Rick Grimes" rollnumber="1" age="15"/>
<student name="Lori Grimes" rollnumber="2" age="16"/>
<student name="Judith Grimes" rollnumber="4" age="13"/>
</students>
The code specified in the link does not work with this formatting.
from xml.etree import ElementTree

tree = ElementTree.parse("input.xml")
root = tree.getroot()
for student in root:
    name = student.find("name").text
    roll_number = student.find("rollnumber").text
    age = student.find("age").text
    print(f"{name},{roll_number},{age}")
I have very little coding experience, so hoping someone on here can help me out.
Expected result:
Rick Grimes,1,15
Lori Grimes,2,16
Carl Grimes,3,14
Judith Grimes,4,13
Actual result:
AttributeError: 'NoneType' object has no attribute 'text'
CodePudding user response:
.text refers to the text content of an element, that is, whatever sits between its opening and closing tags. To make it clear:
<student> text here </student>
You don't have any text here because your tags are self-closing. What you are looking for is each element's attributes, which are available through its attrib dictionary.
Something like this should help you get what you're looking for:
for student in root:
    print(student.attrib)
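To make the difference between text and attributes concrete, here is a minimal sketch. The two-element XML string is made up for illustration (it is not the asker's file), and it is parsed from a string rather than from input.xml so that it runs on its own.

```python
from xml.etree import ElementTree

# Illustrative input: one element with text content,
# one self-closing element that only carries attributes.
root = ElementTree.fromstring(
    "<students>"
    "<student>text here</student>"
    '<student name="Rick Grimes" rollnumber="1" age="15"/>'
    "</students>"
)
with_text, with_attributes = list(root)

print(with_text.text)          # text here
print(with_text.attrib)        # {}
print(with_attributes.text)    # None
print(with_attributes.attrib)  # {'name': 'Rick Grimes', 'rollnumber': '1', 'age': '15'}

# find() searches for child elements, not attributes, so it returns None here.
# Calling .text on that None is what raises the AttributeError in the question.
print(with_attributes.find("name"))  # None
```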
CodePudding user response:
You cannot get the text if there isn't any text to get. Since your values are stored as attributes, you want to use .attrib[key] instead.
I have modified your example so that it will work with your XML file.
from xml.etree import ElementTree

tree = ElementTree.parse("input.xml")
root = tree.getroot()
for student in root:
    name = student.attrib["name"]
    roll_number = student.attrib["rollnumber"]
    age = student.attrib["age"]
    print(f"{name},{roll_number},{age}")
I hope this will help you.
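Since the goal in the question is XML to CSV, the same attribute lookup can feed the standard csv module instead of print. This is only a sketch: the helper name students_to_csv and the FIELDS list are made up for illustration, and the XML is inlined as a string so the example is self-contained.

```python
import csv
import io
from xml.etree import ElementTree

XML_TEXT = """<students>
<student name="Rick Grimes" rollnumber="1" age="15"/>
<student name="Lori Grimes" rollnumber="2" age="16"/>
<student name="Judith Grimes" rollnumber="4" age="13"/>
</students>"""

# Attribute names to export, in column order (assumed from the sample XML).
FIELDS = ["name", "rollnumber", "age"]


def students_to_csv(xml_text):
    """Return the <student> attributes as CSV text (hypothetical helper)."""
    root = ElementTree.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(FIELDS)  # header row
    for student in root.iter("student"):
        # get() falls back to "" instead of raising KeyError
        # when an attribute is missing on an element.
        writer.writerow([student.get(field, "") for field in FIELDS])
    return buffer.getvalue()


print(students_to_csv(XML_TEXT), end="")
```

To write to a real file instead, you could open it with open("output.csv", "w", newline="") and pass that file object to csv.writer in place of the StringIO buffer.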
CodePudding user response:
import io
from xml.etree import ElementTree
xml_string = """<students>
<student name="Rick Grimes" rollnumber="1" age="15"/>
<student name="Lori Grimes" rollnumber="2" age="16"/>
<student name="Judith Grimes" rollnumber="4" age="13"/>
</students>"""
file = io.StringIO(xml_string)
tree = ElementTree.parse(file)
root = tree.getroot()
result = ""
for student in root:
    # Append each record; a plain "=" would keep only the last student
    result += f"{student.attrib['name']},{student.attrib['rollnumber']},{student.attrib['age']} "
print(result)

Result:
Rick Grimes,1,15 Lori Grimes,2,16 Judith Grimes,4,13
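If you want the records on one line without a trailing space, one option is to collect them in a list and join them at the end. This is a sketch of that variant, not part of the answer above; the KEYS tuple is an assumed name for the attribute order.

```python
from xml.etree import ElementTree

xml_string = """<students>
<student name="Rick Grimes" rollnumber="1" age="15"/>
<student name="Lori Grimes" rollnumber="2" age="16"/>
<student name="Judith Grimes" rollnumber="4" age="13"/>
</students>"""

# Attribute order for each record (assumed from the sample XML).
KEYS = ("name", "rollnumber", "age")

root = ElementTree.fromstring(xml_string)
# One "name,rollnumber,age" string per <student> element.
rows = [",".join(student.attrib[key] for key in KEYS) for student in root]
# join() puts the separator only between records, so nothing trails at the end.
result = " ".join(rows)
print(result)
```

Using "\n".join(rows) instead would put each student on its own line.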
CodePudding user response:
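For XML with such a simple structure, you can also use the built-in pandas function read_xml and do the conversion in two lines of code (read_xml requires pandas 1.3 or newer and, by default, the lxml package):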
import pandas as pd

df = pd.read_xml('caroline.xml', xpath='.//student')
# to_csv returns None when given a path, so there is nothing to assign
df.to_csv('caroline.csv', index=False)

# For visualization only
with open('caroline.csv', 'r') as f:
    for line in f:
        # Each line already ends with a newline
        print(line, end='')
Output:
name,rollnumber,age
Rick Grimes,1,15
Lori Grimes,2,16
Judith Grimes,4,13
Passing header=False to to_csv omits the header row from the CSV file.