I'm trying to parse a nested XML with ElementTree.
My xml has: .... .....
<JOB D="32" APPLICATION="DUDDY" PARENT_FOLDER="FOLDEXMAPLEL" END_FOLDER="N">
<VARIABLE NAME="%%YESTER" VALUE="%%$CALCDATE %%$ODATE -2" />
<VARIABLE NAME="%%Y2020" VALUE="GO" />
<VARIABLE NAME="%%Y2021" VALUE="G1" />
<VARIABLE NAME="%%Y2022" VALUE="G2" />
<VARIABLE NAME="%%M01" VALUE="1" />
<VARIABLE NAME="%%M02" VALUE="2" />
<VARIABLE NAME="%%M03" VALUE="3" />
<VARIABLE NAME="%%M04" VALUE="4" />
<VARIABLE NAME="%%M05" VALUE="5" />
<VARIABLE NAME="%%M06" VALUE="6" />
<VARIABLE NAME="%%M07" VALUE="7" />
<VARIABLE NAME="%%PEP-APP_NAME" VALUE="DDDD" />
<VARIABLE NAME="%%PEP-PLATFORM_NAME" VALUE="LIST" />
<VARIABLE NAME="%%PEP-TASK_NAME" VALUE="LIST-may" />
<VARIABLE NAME="%%PEP-ARGUMENTS" VALUE="%%Argument" />
<VARIABLE NAME="%%PEP-ACCOUNT" VALUE="VALUESSSS" />
<VARIABLE NAME="%%Y2023" VALUE="G3" />
.........
I'm trying to get all VALUEs when the value of NAME begins with %%PEP-*.
I'm trying with the find method, but it doesn't work.
for try in j_nodeOS.find(f"./VARIABLE[@NAME='']")
CodePudding user response:
You can check each line with startswith(). Assuming all VARIABLE lines have the same format, I would then just split the line to pull out the needed value.
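with open('path/to/file', 'r', encoding='utf-8') as file:
    for line in file:
        # Only look at VARIABLE lines whose NAME starts with %%PEP-
        if line.strip().startswith('<VARIABLE NAME="%%PEP-'):
            # Third whitespace-separated token is VALUE="..."; keep the part after "="
            print(line.split()[2].split("=")[1])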
This would give you
"DDDD"
"LIST"
"LIST-may"
"%%Argument"
"VALUESSSS"
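One caveat with splitting on whitespace: a VALUE that itself contains a space would be cut short. A minimal sketch using a regular expression instead, assuming each VARIABLE element sits on a single line with NAME before VALUE. The function name pep_values and the sample lines (including the made-up value "LIST may") are only illustrative:

```python
import re

# Matches NAME="%%PEP-..." followed by VALUE="..." on the same line.
# Assumption: NAME always comes before VALUE, as in the question's XML.
PEP_LINE = re.compile(r'<VARIABLE\s+NAME="(%%PEP-[^"]*)"\s+VALUE="([^"]*)"')

def pep_values(lines):
    """Return the VALUE of every VARIABLE line whose NAME starts with %%PEP-."""
    values = []
    for line in lines:
        match = PEP_LINE.search(line)
        if match:
            values.append(match.group(2))  # group 2 is the VALUE, without quotes
    return values

# Hypothetical sample lines; the last value contains a space on purpose.
sample = [
    '<VARIABLE NAME="%%M07" VALUE="7" />',
    '<VARIABLE NAME="%%PEP-APP_NAME" VALUE="DDDD" />',
    '<VARIABLE NAME="%%PEP-TASK_NAME" VALUE="LIST may" />',
]
print(pep_values(sample))
```

Unlike the split-based version, this returns the values without the surrounding quotes.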
CodePudding user response:
It can be done with ElementTree, though it's a bit cumbersome:
import xml.etree.ElementTree as ET

# xml_string: your XML above, well formed
doc = ET.fromstring(xml_string)
for v in doc.findall('.//VARIABLE'):
    # ElementTree has no starts-with(), so filter in Python
    if v.attrib['NAME'].startswith("%%PEP"):
        print(v.attrib['VALUE'])
It's far easier with lxml:
from lxml import etree

# xml_string: your XML above, well formed
doc = etree.fromstring(xml_string)
for value in doc.xpath('//VARIABLE[starts-with(@NAME,"%%PEP")]/@VALUE'):
    print(value)
and it should get you the same output:
DDDD
LIST
LIST-may
%%Argument
VALUESSSS
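For reference, the find attempt in the question fails because ElementTree only implements a small XPath subset: a predicate like [@NAME='...'] matches an exact attribute value, and functions such as starts-with() are not available. Below is a self-contained sketch of the ElementTree route, assuming a trimmed copy of the JOB element with a closing tag added so it parses. The helper name values_with_prefix is hypothetical:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the JOB element from the question, closed so it is well formed.
XML = """
<JOB D="32" APPLICATION="DUDDY" PARENT_FOLDER="FOLDEXMAPLEL" END_FOLDER="N">
  <VARIABLE NAME="%%YESTER" VALUE="%%$CALCDATE %%$ODATE -2" />
  <VARIABLE NAME="%%PEP-APP_NAME" VALUE="DDDD" />
  <VARIABLE NAME="%%PEP-PLATFORM_NAME" VALUE="LIST" />
  <VARIABLE NAME="%%Y2023" VALUE="G3" />
</JOB>
"""

def values_with_prefix(root, prefix):
    """Collect VALUE from every VARIABLE whose NAME starts with the prefix."""
    return [
        v.get("VALUE")
        for v in root.iter("VARIABLE")
        # .get with a default avoids a KeyError if NAME is missing
        if v.get("NAME", "").startswith(prefix)
    ]

root = ET.fromstring(XML)
print(values_with_prefix(root, "%%PEP-"))
```

Using .get() rather than .attrib[...] is a small design choice: a VARIABLE element without a NAME attribute is skipped instead of raising an exception.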