Today i try to convert simple xml to csv file. XML:
<cars>
<car>
<id>1234</id>
<name>Name Of Car</name>
<description>Nice Car! </description>
<image>http://www.images.com</image>
<cost>432</cost>
<cat>roadster</cat>
<property name="Url">someurl</property>
<property name="Color">red</property>
<property name="Produce">fiat</property>
<property name="SystemID">15</property>
</car>
</cars>
tree = ET.parse("test.xml")
root = tree.getroot()
Resident_data = open('ResidentData.csv', 'w', encoding='utf-8-sig')
csvwriter = csv.writer(Resident_data)
resident_head = []
count = 0
start = 0
for member in root.findall('car'):
offers = []
if count == 0:
resident_head.append("ID")
resident_head.append("NAME")
resident_head.append("DESC")
resident_head.append("IMG")
resident_head.append("PRICE")
resident_head.append("TYPE")
resident_head.append("URL")
resident_head.append("COLOR")
resident_head.append("PRODUCE")
resident_head.append("SYSID")
csvwriter.writerow(resident_head)
count = count 1
ID = member.find('id').text
offers.append(ID)
NAME = member.find('name').text
offers.append(NAME)
DESC = member.find('description').text
offers.append(DESC)
IMG = member.find('image').text
offers.append(IMG)
PRICE = member.find('cost').text
offers.append(PRICE)
CAT = member.find('cat').text
offers.append(CAT)
csvwriter.writerow(offers)
start = start 1
print(start)
Resident_data.close()
Output: |ID|NAME|DESC|IMG|PRICE|TYPE|URL|COLOR|PRODUCE|SYSID| |--|----|----|---|-----|----|---|-----|-------|-----| |1234|Name Of car|Nice Car!|http:..|432|roadster|
All working good but for this moment, but i don't know how to loop last columns, because here is <property name="x">
. Also sometimes SYSID or COLOR is none (empty in system)
Ps. In preview my table looks ok...
Thanks! :)
CodePudding user response:
See below (The code builds the csv headers dynamically based on the xml data)
import xml.etree.ElementTree as ET
import csv
xml = '''<?xml version="1.0" encoding="UTF-8"?>
<cars>
<car>
<id>1234</id>
<name>Name Of Car</name>
<description>Nice Car!</description>
<image>http://www.images.com</image>
<cost>432</cost>
<cat>roadster</cat>
<property name="Url">someurl</property>
<property name="Color">red</property>
<property name="Produce">fiat</property>
<property name="SystemID">15</property>
</car>
<car>
<id>1235</id>
<name>Name Of Car1</name>
<description>Nice Car!1</description>
<image>http://www.images.com</image>
<cost>435</cost>
<cat>roadster45</cat>
<property name="Url">someurl34</property>
<property name="Color">green</property>
<property name="Produce">nest</property>
<property name="SystemID">45</property>
</car>
</cars>'''
root = ET.fromstring(xml)
headers = []
data = []
for idx,car in enumerate(root.findall('.//car')):
if idx == 0:
for p in list(car):
if p.tag != 'property':
headers.append(p.tag.upper())
else:
headers.append(p.attrib['name'].upper())
data.append([p.text for p in list(car)])
with open('out.csv', 'w') as f:
wr = csv.writer(f)
wr.writerow(headers)
for entry in data:
wr.writerow(entry)
out.csv
ID,NAME,DESCRIPTION,IMAGE,COST,CAT,URL,COLOR,PRODUCE,SYSTEMID
1234,Name Of Car,Nice Car!,http://www.images.com,432,roadster,someurl,red,fiat,15
1235,Name Of Car1,Nice Car!1,http://www.images.com,435,roadster45,someurl34,green,nest,45
CodePudding user response:
If there's possible extra fields or some fields missing, it could be a good option to use DictWriter
. I your example <cars>
node (which is root node) contains only <car>
child nodes, so we can just iterate over root node. If your real XML is more complex, you can use .iterparse()
with simple .//car
XPath expression.
Code:
import xml.etree.ElementTree as ET
from csv import DictWriter
xml = ...
root = ET.fromstring(xml) # replace with initialization from file (if needed)
with open("result.csv", "w", newline="") as f:
writer = None
for car_node in root:
car = {node.get("name") or node.tag: node.text for node in car_node}
if not writer:
writer = DictWriter(f, car, extrasaction="ignore")
writer.writeheader()
writer.writerow(car)