How to parse XML with xml.etree.ElementTree, using the sample below

I am trying to parse the XML below, which seems to follow a different model.

<?xml version="1.0" encoding="UTF-8"?>
<book>
  <item neighbor-name="ABC-LENGTH" pos="1" size="8" type="INT"/>
  <item neighbor-name="ABC-CODE" pos="9" size="3" type="STRING"/>
  <item neighbor-name="DEF-IND" pos="12" size="1" type="STRING"/>
  <item neighbor-name="JKL-ID" pos="13" size="15" type="STRING"/>
  <item neighbor-name="KLN-DATE" pos="28" size="8" type="STRING" red="true">
    <item neighbor-name="KER-YR" pos="28" size="4" type="INT"/>
    <item neighbor-name="KER-MO" pos="32" size="2" type="INT"/>
    <item neighbor-name="KER-DA" pos="34" size="2" type="INT"/>
  </item>
</book>

I want to pull out only the attribute values with the parser:

        ABC-LENGTH       1           8       INT
        ABC-CODE         9           3       STRING
        .
        .
        KLN-DATE        28           8       STRING      true
        .
        .

But nothing seems to work. I tried all the options, such as tag and attribute, but each time the script exits with return code zero and prints no output.

Thanks in advance.

CodePudding user response：

I copied your XML into a file named "book.xml". The values are stored as attributes rather than as element text, so you can easily walk through the tree with .iter() and grab the attribute values with .get():

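import pandas as pd
import xml.etree.ElementTree as ET

# Parse the file and get the root element (<book>)
tree = ET.parse("book.xml")
root = tree.getroot()

# Attribute names to read from every <item>
columns = ["neighbor-name", "pos", "size", "type", "red"]
data = []

# .iter("item") visits every <item>, including the nested ones
for node in root.iter("item"):
    # .get() returns None when an attribute (e.g. "red") is missing
    a = [node.get("neighbor-name"), node.get("pos"), node.get("size"), node.get("type"), node.get("red")]
    data.append(a)

df = pd.DataFrame(data, columns=columns)
print(df)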

Output:

  neighbor-name pos size    type   red
0    ABC-LENGTH   1    8     INT  None
1      ABC-CODE   9    3  STRING  None
2       DEF-IND  12    1  STRING  None
3        JKL-ID  13   15  STRING  None
4      KLN-DATE  28    8  STRING  true
5        KER-YR  28    4     INT  None
6        KER-MO  32    2     INT  None
7        KER-DA  34    2     INT  None
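
If you do not need pandas, or you only want the top-level rows as in the question's desired output, something like the following might work. It is only a sketch: the names XML, FIELDS and item_rows are made up for illustration, the sample is inlined so it runs without "book.xml", and replacing missing attributes with an empty string is my own choice. It relies on .iter("item") visiting every <item> at any depth, while .findall("item") returns only the direct children of the element it is called on.

```python
import xml.etree.ElementTree as ET

# Sample from the question, inlined (as bytes) so no book.xml is needed.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<book>
  <item neighbor-name="ABC-LENGTH" pos="1" size="8" type="INT"/>
  <item neighbor-name="ABC-CODE" pos="9" size="3" type="STRING"/>
  <item neighbor-name="DEF-IND" pos="12" size="1" type="STRING"/>
  <item neighbor-name="JKL-ID" pos="13" size="15" type="STRING"/>
  <item neighbor-name="KLN-DATE" pos="28" size="8" type="STRING" red="true">
    <item neighbor-name="KER-YR" pos="28" size="4" type="INT"/>
    <item neighbor-name="KER-MO" pos="32" size="2" type="INT"/>
    <item neighbor-name="KER-DA" pos="34" size="2" type="INT"/>
  </item>
</book>"""

# Hypothetical helper names; the attribute names come from the sample XML.
FIELDS = ("neighbor-name", "pos", "size", "type", "red")


def item_rows(root, nested=True):
    """Return one tuple of attribute values per <item>.

    nested=True  -> every <item> at any depth (root.iter)
    nested=False -> only direct children of root (root.findall)
    """
    items = root.iter("item") if nested else root.findall("item")
    # Missing attributes (e.g. "red") become "" instead of None.
    return [tuple(node.get(name, "") for name in FIELDS) for node in items]


root = ET.fromstring(XML)

# Print only the top-level items in aligned columns.
for row in item_rows(root, nested=False):
    print("{:<12}{:>4}{:>6}   {:<8}{}".format(*row))
```

Calling item_rows(root) without nested=False also returns the three KER-* rows, matching the DataFrame above.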