The XML doc I'm working with has the following structure:
<tok id="w-2279">del
<dtok form="de" id="d-2279-1" ord="20" lemma="de" xpos="SPC00"/>
<dtok form="el" id="d-2279-2" ord="21" lemma="el" xpos="DA0MS0"/>
</tok>
<tok id="w-2280" ord="22" lemma="sobredit" xpos="AQ0MS00">sobredit
</tok>
<tok id="w-2281" ord="23" lemma="," xpos="Fc">,
</tok>
I need to select the value of the 'lemma' attribute of the last child of any 'tok' element that precedes a 'tok' element whose 'xpos' attribute starts with 'AQ'.
I have tried with:
//tok[starts-with(@xpos, "AQ")]/preceding::tok/dtok[position()=1]
//tok[starts-with(@xpos, "AQ")]/preceding::tok/dtok[position()=last()]
//tok[starts-with(@xpos, "AQ")]/preceding::tok/dtok[1]
but the selected 'dtok' element is always the first child (i.e. the one with value 'de' for attribute 'lemma').
What am I doing wrong? How does one specify that only the last one of all the children must be selected for value extraction?
CodePudding user response:
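I think your second attempt, //tok[starts-with(@xpos, "AQ")]/preceding::tok/dtok[position()=last()],
can be shortened to //tok[starts-with(@xpos, "AQ")]/preceding::tok/dtok[last()].
That should work fine, and indeed it does for me: on your sample it selects the last 'dtok' child (the one with lemma 'el'). To get the attribute value itself rather than the element, append /@lemma to the expression.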
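
If you want to check the expected result outside your XPath tool, here is a minimal Python sketch of the same selection. It rests on a few assumptions of mine: the fragment is wrapped in a hypothetical <s> root so that it parses, the 'tok' elements are not nested inside each other, and the helper name last_dtok_lemmas is made up for illustration. The standard library's xml.etree.ElementTree does not support the preceding:: axis or starts-with(), so those two steps are emulated in plain Python; only the dtok[last()] step is handed to ElementTree.

```python
import xml.etree.ElementTree as ET

# The fragment from the question, wrapped in a hypothetical <s> root
# (an assumption) so that it forms a well-formed document.
XML = """<s>
<tok id="w-2279">del
<dtok form="de" id="d-2279-1" ord="20" lemma="de" xpos="SPC00"/>
<dtok form="el" id="d-2279-2" ord="21" lemma="el" xpos="DA0MS0"/>
</tok>
<tok id="w-2280" ord="22" lemma="sobredit" xpos="AQ0MS00">sobredit
</tok>
<tok id="w-2281" ord="23" lemma="," xpos="Fc">,
</tok>
</s>"""


def last_dtok_lemmas(xml_text):
    """Return the lemma of the last <dtok> child of every <tok>
    that precedes some <tok> whose xpos starts with 'AQ'."""
    root = ET.fromstring(xml_text)
    toks = list(root.iter("tok"))  # all <tok> elements in document order

    # Emulates //tok[starts-with(@xpos, "AQ")]
    aq_positions = [i for i, tok in enumerate(toks)
                    if tok.get("xpos", "").startswith("AQ")]
    if not aq_positions:
        return []

    # Emulates /preceding::tok for non-nested toks: the union over all
    # AQ toks is every tok located before the last AQ tok.
    preceding = toks[:max(aq_positions)]

    lemmas = []
    for tok in preceding:
        last = tok.find("dtok[last()]")  # the last <dtok> child, if any
        if last is not None:
            lemmas.append(last.get("lemma"))
    return lemmas


print(last_dtok_lemmas(XML))
```

On the sample from the question this prints ['el'], which is what dtok[last()] should give. If your tool still returns the 'de' element, it is worth checking that it is actually evaluating the [last()] variant and not one of the [1] variants.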