It's a great pleasure to meet you via this platform. As a beginner, I have a question. I have a sample XML document that I need to convert to CSV:
<message>
<mail>[email protected]</mail>
<address>
<houseno> 123 </houseno>
</address>
</message>
<message>
<mail>[email protected]</mail>
<contact> 23278378</contact>
<address>
<houseno> 123 </houseno>
</address>
</message>
<message>
<mail>[email protected]</mail>
<address>
<houseno> 123 </houseno>
</address>
<place> Mumbai </place>
</message>
In the XML above, I want to read the mail, contact, address and place values and store them in a CSV file. The message elements do not all contain the same tags, so I am unable to access them consistently. How can I do this?
CodePudding user response:
One approach would be to install and use BeautifulSoup. It can locate all of the message entries. For example:
from bs4 import BeautifulSoup
import csv
xml = """<message>
<mail>[email protected]</mail>
<address>
<houseno> 123 </houseno>
</address>
</message>
<message>
<mail>[email protected]</mail>
<contact> 23278378</contact>
<address>
<houseno> 123 </houseno>
</address>
</message>
<message>
<mail>[email protected]</mail>
<address>
<houseno> 123 </houseno>
</address>
<place> Mumbai </place>
</message>"""
# html.parser does not require a single root element
soup = BeautifulSoup(xml, "html.parser")

with open('output.csv', 'w', newline='', encoding='utf-8') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(['Mail', 'Address', 'Place'])

    for message in soup.find_all('message'):
        mail = message.mail.text
        address = message.address.houseno.get_text(strip=True)
        # <place> is optional, so fall back to an empty string when it is missing
        place = message.place.get_text(strip=True) if message.place else ''
        csv_output.writerow([mail, address, place])
This would create a CSV output file as:
Mail,Address,Place
[email protected],123,
[email protected],123,
[email protected],123,Mumbai
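The question also asks for the contact value, which the example above does not write out. As a rough sketch of one way to handle every optional tag uniformly, the following uses the standard library's xml.etree.ElementTree instead of BeautifulSoup. The names FIELDS, messages_to_rows and rows_to_csv are made up for illustration, and the <root> wrapper is an assumption needed because the sample has several top-level message elements and is not well-formed XML on its own.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Tags to export, one CSV column each (hypothetical choice based on the question).
FIELDS = ["mail", "contact", "houseno", "place"]


def messages_to_rows(xml_fragment):
    """Return one dict per <message>, using '' for any tag that is missing."""
    # Wrap the fragment in an invented <root> element so that it parses
    # as a single well-formed document.
    root = ET.fromstring(f"<root>{xml_fragment}</root>")
    rows = []
    for message in root.iter("message"):
        row = {}
        for field in FIELDS:
            # ".//tag" searches at any depth, so the nested <houseno> is found too.
            text = message.findtext(f".//{field}", default="")
            row[field] = text.strip()
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling rows_to_csv(messages_to_rows(xml)) with the xml string from the answer above returns the CSV text, which can then be written to a file opened with newline=''. Because every field goes through the same lookup with an empty default, a message that lacks contact or place simply gets an empty cell.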