Parsing XML with Python: the tree iterator doesn't return all elements

Time:12-24

I have an XML file such as:

<JOB JOBISN="443" JJJJ="JOBDSFSFDF" INT="00001M"  USERCHANGE="ssss" FOLD="ARDD">
    <OUTCOND NAME="JOBDDDDD-TODDDDD" SIGNED="DDDDDDD" />
</JOB>

I'm trying to get all elements but when I do:

from xml.etree import ElementTree

with open(file1, 'rt') as f:
    tree = ElementTree.parse(f)

for node in tree.iter('JOB'):
    for node2 in node.items():
        print(node2)

I can see all items like these:

('JOBISN', '443')
('JJJJ', 'JOBDSFSFDF')
('INT', '00001M')
('USERCHANGE', 'ssss')
('FOLD', 'ARDD')

But I can't see this line:

<OUTCOND NAME="JOBDDDDD-TODDDDD" SIGNED="DDDDDDD" />

Do you know why? Thanks

CodePudding user response:

tree.iter('JOB') iterates over elements, not attributes: it yields every element in the tree whose tag is JOB. The attributes you printed come from node.items(), which returns only the attributes of that JOB element.

<JOB JOBISN="443" JJJJ="JOBDSFSFDF" INT="00001M"  USERCHANGE="ssss" FOLD="ARDD">
    <OUTCOND NAME="JOBDDDDD-TODDDDD" SIGNED="DDDDDDD" />
</JOB>

OUTCOND is not an attribute of JOB; it is a child element nested inside it. To reach it, iterate over the JOB element itself, which yields its direct children. (The older getchildren() method did the same thing, but it was deprecated and then removed in Python 3.9.)

from xml.etree import ElementTree

with open(file1, 'rt') as f:
    tree = ElementTree.parse(f)

root = tree.getroot()

# Iterating over an element yields its direct children.
for node in root:
    print(node.tag, node.attrib)
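If you want every element in the file rather than only the direct children of the root, calling iter() with no tag argument walks the whole tree. The following is a minimal sketch under that assumption; the helper name collect_elements is made up for illustration, and the XML from the question is inlined so the example runs on its own.

```python
import xml.etree.ElementTree as ET

# Sample from the question, inlined so the sketch is self-contained.
XML = '''<JOB JOBISN="443" JJJJ="JOBDSFSFDF" INT="00001M"  USERCHANGE="ssss" FOLD="ARDD">
    <OUTCOND NAME="JOBDDDDD-TODDDDD" SIGNED="DDDDDDD" />
</JOB>'''


def collect_elements(xml_text):
    """Return (tag, attributes) for the root element and every descendant."""
    root = ET.fromstring(xml_text)
    # iter() without a tag yields the element itself first,
    # then all of its descendants in document order.
    return [(elem.tag, dict(elem.attrib)) for elem in root.iter()]


for tag, attrs in collect_elements(XML):
    print(tag, attrs)
```

With the sample above this visits JOB first and then OUTCOND, so both the attributes of JOB and the attributes of the nested element show up.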

CodePudding user response:

See below (use XPath to find the JOB elements)

import xml.etree.ElementTree as ET
import pprint


xml = '''<JOBS>
<JOB JOBISN="443" JJJJ="JOBDSFSFD21F" INT="00001M"  USERCHANGE="ssss" FOLD="ARDD">
    <OUTCOND NAME="JOBDDDDD-TODDDDD11" SIGNED="DDDDDDD" />
</JOB>
<JOB JOBISN="643" JJJJ="JOBDSFSFDF44" INT="00001M"  USERCHANGE="ssss" FOLD="ARDD">
    <OUTCOND NAME="JOBDDDDD-TODDDDD12" SIGNED="DDDDDDD" />
</JOB>
</JOBS>'''

root = ET.fromstring(xml)
for job in root.findall('.//JOB'):
  attrs = dict(job.attrib)  # copy, so the parsed element itself is not modified
  attrs['OUTCOND_NAME'] = job.find('OUTCOND').attrib['NAME']
  pprint.pprint(attrs)

Output:

{'FOLD': 'ARDD',
 'INT': '00001M',
 'JJJJ': 'JOBDSFSFD21F',
 'JOBISN': '443',
 'OUTCOND_NAME': 'JOBDDDDD-TODDDDD11',
 'USERCHANGE': 'ssss'}
{'FOLD': 'ARDD',
 'INT': '00001M',
 'JJJJ': 'JOBDSFSFDF44',
 'JOBISN': '643',
 'OUTCOND_NAME': 'JOBDDDDD-TODDDDD12',
 'USERCHANGE': 'ssss'}
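The snippet above assumes each JOB has exactly one OUTCOND and that JOB is not the root element: find() returns None when there is no match, and findall('.//JOB') does not match the root itself. Here is a hedged variant that tolerates both cases. The function name job_records and the key OUTCOND_NAMES are hypothetical, chosen only for this sketch.

```python
import xml.etree.ElementTree as ET


def job_records(xml_text):
    """Return one dict per JOB element, including the names of its OUTCOND children."""
    root = ET.fromstring(xml_text)
    records = []
    # iter('JOB') also matches the root when the root itself is a JOB,
    # as in the file from the question.
    for job in root.iter('JOB'):
        record = dict(job.attrib)  # copy, leaves the parsed tree untouched
        # findall() returns an empty list when a JOB has no OUTCOND child.
        record['OUTCOND_NAMES'] = [oc.get('NAME') for oc in job.findall('OUTCOND')]
        records.append(record)
    return records


sample = '''<JOBS>
<JOB JOBISN="443">
    <OUTCOND NAME="A" />
    <OUTCOND NAME="B" />
</JOB>
<JOB JOBISN="643" />
</JOBS>'''

for record in job_records(sample):
    print(record)
```

Collecting the names into a list keeps the record shape the same whether a JOB has zero, one, or several OUTCOND children.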