The structure of the data is shown below. It is an XML file:
<ROOT>
<data>
<record>
<field name="Country or Area">Afghanistan</field>
<field name="Year">2020</field>
<field name="Item">Gross Domestic Product (GDP)</field>
<field name="Value">508.453721937094</field>
</record>
<record>
<field name="Country or Area">Afghanistan</field>
<field name="Year">2019</field>
<field name="Item">Gross Domestic Product (GDP)</field>
<field name="Value">496.940552822825</field>
</record>
</data> </ROOT>
I have tried the code below, as well as other methods, but with no luck:
import pandas as pd
from lxml import objectify
xml = objectify.parse('GDP_pc.xml')
root = xml.getroot()
data = []
for i in range(len(root.getchildren())):
    data.append([child.text for child in root.getchildren()[i].getchildren()])
df = pd.DataFrame(data)
df.columns = ['Country or Area', 'Year', 'Item', 'Value']
CodePudding user response:
Have you tried the pandas function pd.read_xml() (available since pandas 1.3)? It reads an XML file and transforms it into a DataFrame.
Just do the following:
df = pd.read_xml('GDP_pc.xml')
You can read more about it in the official documentation.
CodePudding user response:
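See below. The code builds one dictionary per record, keyed by each field's name attribute, and passes the list of dictionaries to pd.DataFrame: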
import xml.etree.ElementTree as ET
import pandas as pd
xml = '''<ROOT>
<data>
<record>
<field name="Country or Area">Afghanistan</field>
<field name="Year">2020</field>
<field name="Item">Gross Domestic Product (GDP)</field>
<field name="Value">508.453721937094</field>
</record>
<record>
<field name="Country or Area">Afghanistan</field>
<field name="Year">2019</field>
<field name="Item">Gross Domestic Product (GDP)</field>
<field name="Value">496.940552822825</field>
</record>
</data>
</ROOT>'''
data = []
root = ET.fromstring(xml)
for rec in root.findall('.//record'):
    data.append({field.attrib['name']: field.text for field in rec.findall('field')})
df = pd.DataFrame(data)
print(df)
Output:
  Country or Area  Year                          Item             Value
0     Afghanistan  2020  Gross Domestic Product (GDP)  508.453721937094
1     Afghanistan  2019  Gross Domestic Product (GDP)  496.940552822825
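One thing to keep in mind: field.text is always a string, so the Year and Value columns above hold text, not numbers. Here is a minimal sketch of converting them while parsing, assuming you want Year as int and Value as float. The parse_records helper and the CONVERTERS mapping are hypothetical names made up for illustration, not part of any library.

```python
import xml.etree.ElementTree as ET

XML = """<ROOT>
<data>
<record>
<field name="Country or Area">Afghanistan</field>
<field name="Year">2020</field>
<field name="Item">Gross Domestic Product (GDP)</field>
<field name="Value">508.453721937094</field>
</record>
<record>
<field name="Country or Area">Afghanistan</field>
<field name="Year">2019</field>
<field name="Item">Gross Domestic Product (GDP)</field>
<field name="Value">496.940552822825</field>
</record>
</data>
</ROOT>"""

# Assumed target types; any column not listed here is left as a string.
CONVERTERS = {"Year": int, "Value": float}


def parse_records(xml_text):
    """Return one dict per <record>, converting the columns named in CONVERTERS."""
    rows = []
    for rec in ET.fromstring(xml_text).iter("record"):
        row = {}
        for field in rec.findall("field"):
            name = field.attrib["name"]
            text = field.text
            convert = CONVERTERS.get(name)
            # Leave empty fields as None instead of failing on int(None).
            row[name] = convert(text) if convert and text is not None else text
        rows.append(row)
    return rows


rows = parse_records(XML)
print(rows[0])
```

The resulting list can be passed to pd.DataFrame(rows) in the same way as in the answer above. If you would rather convert after building the DataFrame, df.astype({'Year': int, 'Value': float}) should do the same job.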