Get tag when attribute value begins with a pattern

I'm trying to parse a nested XML with ElementTree.

My xml has: .... .....

       <JOB D="32" APPLICATION="DUDDY"  PARENT_FOLDER="FOLDEXMAPLEL" END_FOLDER="N">
            <VARIABLE NAME="%%YESTER" VALUE="%%$CALCDATE %%$ODATE -2" />
            <VARIABLE NAME="%%Y2020" VALUE="GO" />
            <VARIABLE NAME="%%Y2021" VALUE="G1" />
            <VARIABLE NAME="%%Y2022" VALUE="G2" />
            <VARIABLE NAME="%%M01" VALUE="1" />
            <VARIABLE NAME="%%M02" VALUE="2" />
            <VARIABLE NAME="%%M03" VALUE="3" />
            <VARIABLE NAME="%%M04" VALUE="4" />
            <VARIABLE NAME="%%M05" VALUE="5" />
            <VARIABLE NAME="%%M06" VALUE="6" />
            <VARIABLE NAME="%%M07" VALUE="7" />
            <VARIABLE NAME="%%PEP-APP_NAME" VALUE="DDDD" />
            <VARIABLE NAME="%%PEP-PLATFORM_NAME" VALUE="LIST" />
            <VARIABLE NAME="%%PEP-TASK_NAME" VALUE="LIST-may" />
            <VARIABLE NAME="%%PEP-ARGUMENTS" VALUE="%%Argument" />
            <VARIABLE NAME="%%PEP-ACCOUNT" VALUE="VALUESSSS" />
            <VARIABLE NAME="%%Y2023" VALUE="G3" />
.........

I want to get the VALUE of every VARIABLE whose NAME begins with %%PEP-.

I tried the find method, but it doesn't work:

for try in j_nodeOS.find(f"./VARIABLE[@NAME='']")

CodePudding user response：

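You can check each line with startswith(). Assuming every VARIABLE line has the same format, you can then split the line on whitespace, take the VALUE="..." token, and split that on "=" to get the value. Note that this only works when the value itself contains no spaces or "=" characters.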

with open('path/to/file', 'r', encoding='utf-8') as file:
    for line in file:
        if line.strip().startswith('<VARIABLE NAME="%%PEP-'):
            print(line.split()[2].split("=")[1])

This would give you

"DDDD"
"LIST"
"LIST-may"
"%%Argument"
"VALUESSSS"
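
If the surrounding quotes are unwanted, or a value might contain spaces, a regular expression can capture the attribute contents directly. This is only a sketch, assuming every VARIABLE element sits on one line with NAME before VALUE, as in the question; the function name pep_values_from_lines and the sample lines are made up for illustration.

```python
import re

# Matches a single-line VARIABLE element whose NAME starts with %%PEP-
# and captures the contents of VALUE without the surrounding quotes.
PEP_LINE = re.compile(r'<VARIABLE\s+NAME="%%PEP-[^"]*"\s+VALUE="([^"]*)"')

def pep_values_from_lines(lines):
    """Return the VALUE of every %%PEP- variable found in an iterable of lines."""
    values = []
    for line in lines:
        match = PEP_LINE.search(line)
        if match:
            values.append(match.group(1))
    return values

# Hypothetical sample lines in the same shape as the question's XML.
sample = [
    '    <VARIABLE NAME="%%M07" VALUE="7" />',
    '    <VARIABLE NAME="%%PEP-APP_NAME" VALUE="DDDD" />',
    '    <VARIABLE NAME="%%PEP-TASK_NAME" VALUE="LIST-may" />',
]
print(pep_values_from_lines(sample))
```

Passing an open file object instead of the list works the same way, since both are iterables of lines. A real XML parser is still the safer choice if the attribute order or line layout can vary.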

CodePudding user response：

It can be done with ElementTree, though it's a bit cumbersome:

import xml.etree.ElementTree as ET
doc = ET.fromstring(xml_string)  # xml_string: your XML above, well formed
for v in doc.findall('.//VARIABLE'):
    # .get() with a default avoids a KeyError if an element has no NAME
    if v.get('NAME', '').startswith("%%PEP"):
        print(v.get('VALUE'))

It's far easier with lxml:

from lxml import etree
doc = etree.fromstring(xml_string)  # xml_string: your XML above, well formed
for value in doc.xpath('//VARIABLE[starts-with(@NAME,"%%PEP")]/@VALUE'):
    print(value)

and it should get you the same output:

DDDD
LIST
LIST-may
%%Argument
VALUESSSS
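
For reference, here is a self-contained version of the ElementTree approach. ElementTree's find() and findall() support only a subset of XPath: an exact match such as [@NAME='...'] works, but starts-with() does not, so the prefix test has to be done in Python. This is a sketch using a shortened copy of the XML from the question; the helper name pep_values and its prefix parameter are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the JOB element from the question.
XML = """
<JOB APPLICATION="DUDDY" PARENT_FOLDER="FOLDEXMAPLEL" END_FOLDER="N">
    <VARIABLE NAME="%%M07" VALUE="7" />
    <VARIABLE NAME="%%PEP-APP_NAME" VALUE="DDDD" />
    <VARIABLE NAME="%%PEP-PLATFORM_NAME" VALUE="LIST" />
    <VARIABLE NAME="%%Y2023" VALUE="G3" />
</JOB>
"""

def pep_values(root, prefix="%%PEP-"):
    """Map NAME -> VALUE for every VARIABLE whose NAME starts with prefix."""
    return {
        var.get("NAME"): var.get("VALUE")
        for var in root.iter("VARIABLE")          # all VARIABLE descendants
        if var.get("NAME", "").startswith(prefix)  # prefix test done in Python
    }

root = ET.fromstring(XML.strip())
for name, value in pep_values(root).items():
    print(name, value)
```

Returning a dict keeps each value tied to its variable name; if only the values are needed, list(pep_values(root).values()) gives them in document order.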