I am trying to parse the XML below, which seems to follow a different model: all the data is stored in attributes rather than in element text.
<?xml version="1.0" encoding="UTF-8"?>
<book>
<item neighbor-name="ABC-LENGTH" pos="1" size="8" type="INT"/>
<item neighbor-name="ABC-CODE" pos="9" size="3" type="STRING"/>
<item neighbor-name="DEF-IND" pos="12" size="1" type="STRING"/>
<item neighbor-name="JKL-ID" pos="13" size="15" type="STRING"/>
<item neighbor-name="KLN-DATE" pos="28" size="8" type="STRING" red="true">
<item neighbor-name="KER-YR" pos="28" size="4" type="INT"/>
<item neighbor-name="KER-MO" pos="32" size="2" type="INT"/>
<item neighbor-name="KER-DA" pos="34" size="2" type="INT"/>
</item>
</book>
I am trying to pull only the attribute values through the parser, one line per item:
ABC-LENGTH 1 8 INT
ABC-CODE 9 3 STRING
.
.
KLN-DATE 28 8 STRING true
.
.
However, nothing seems to work. I tried all the options (tag, attribute, etc.), but each time the return code is zero and there is no output.
Thanks in advance.
CodePudding user response:
I copied your XML into a file named "book.xml".
You can then easily walk through every element with .iter()
and grab the attribute values with .get():
import pandas as pd
import xml.etree.ElementTree as ET

tree = ET.parse("book.xml")
root = tree.getroot()

columns = ["neighbor-name", "pos", "size", "type", "red"]
data = []

# .iter("item") visits every <item>, including the ones nested inside KLN-DATE
for node in root.iter("item"):
    # .get() returns None when an attribute (here "red") is missing
    row = [node.get(name) for name in columns]
    data.append(row)

df = pd.DataFrame(data, columns=columns)
print(df)
Output:
  neighbor-name pos size    type   red
0    ABC-LENGTH   1    8     INT  None
1      ABC-CODE   9    3  STRING  None
2       DEF-IND  12    1  STRING  None
3        JKL-ID  13   15  STRING  None
4      KLN-DATE  28    8  STRING  true
5        KER-YR  28    4     INT  None
6        KER-MO  32    2     INT  None
7        KER-DA  34    2     INT  None
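If you only need the plain text lines from your question and not a DataFrame, the same .iter() / .get() idea works without pandas. The following is just a sketch under a few assumptions: the XML is pasted inline (without the XML declaration) so the snippet runs without "book.xml", and the names XML, FIELDS and item_lines are made up for this example.

```python
import xml.etree.ElementTree as ET

# Inline copy of the XML from the question (declaration line left out),
# used here only so the sketch runs on its own.
XML = """<book>
<item neighbor-name="ABC-LENGTH" pos="1" size="8" type="INT"/>
<item neighbor-name="ABC-CODE" pos="9" size="3" type="STRING"/>
<item neighbor-name="DEF-IND" pos="12" size="1" type="STRING"/>
<item neighbor-name="JKL-ID" pos="13" size="15" type="STRING"/>
<item neighbor-name="KLN-DATE" pos="28" size="8" type="STRING" red="true">
<item neighbor-name="KER-YR" pos="28" size="4" type="INT"/>
<item neighbor-name="KER-MO" pos="32" size="2" type="INT"/>
<item neighbor-name="KER-DA" pos="34" size="2" type="INT"/>
</item>
</book>"""

# Attribute names to print, in the order requested in the question.
FIELDS = ("neighbor-name", "pos", "size", "type", "red")


def item_lines(root):
    """Return one space-separated line per <item>, skipping missing attributes."""
    lines = []
    for node in root.iter("item"):
        values = [node.get(name) for name in FIELDS]
        # Drop None so items without "red" do not print a placeholder.
        lines.append(" ".join(v for v in values if v is not None))
    return lines


lines = item_lines(ET.fromstring(XML))
for line in lines:
    print(line)
```

For a file on disk, replace ET.fromstring(XML) with ET.parse("book.xml").getroot(). Note that .iter() also returns the three items nested inside KLN-DATE; if you only want the top level, root.findall("item") matches direct children only.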