I have many XML files that may contain a specific element which serves as an identifier for the record. The element may be missing, or it may occur exactly once; that is the easy case, which I have already solved. But the element may also occur several times, and in that case I need to check whether all of its values are the same. Here are some example XML files with the expected output:
Input 1:
<root>
<daily>
<group>ID@12</group>
<surrounded>rich</surrounded>
<clothing>rod</clothing>
<outside>-1084855717</outside>
<section>-1103031959</section>
</daily>
<group>ID@13</group>
<account>remain</account>
<point>-1624875729</point>
<cotton>941344054.3731294</cotton>
<group>ID@12</group>
<scale>almost</scale>
</root>
Output: false, because there are three occurrences, but one of them differs from the other two.
Input 2:
<root>
<daily>
<group>ID@12</group>
<mill>spread</mill>
<surrounded>rich</surrounded>
<clothing>rod</clothing>
<outside>-1084855717</outside>
<section>-1103031959</section>
</daily>
<group>ID@12</group>
<account>remain</account>
<point>-1624875729</point>
<cotton>941344054.3731294</cotton>
<scale>almost</scale>
</root>
Output: true, because there are two occurrences and both have the same value.
I need to use XPath expressions, but an XSLT file would also be okay.
I managed to find the relevant elements with the following expression, but I'm stuck at that point:
//group[starts-with(text(), 'ID@')]
CodePudding user response:
If //group[starts-with(text(), 'ID@')] selects the elements, I would first shorten that to
//group[starts-with(., 'ID@')]
and then use, for instance,
count(distinct-values(//group[starts-with(., 'ID@')])) = 1
That assumes XPath 2, where distinct-values is available, as your XQuery tag suggests.
In XSLT or XQuery you could of course also group those elements by their own value and check that there is exactly one group.
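To make the logic of that check concrete, here is a minimal Python sketch that emulates it with the standard library only. It does not run XPath; the function name ids_all_equal is made up for illustration, and the set plays the role of distinct-values.

```python
import xml.etree.ElementTree as ET


def ids_all_equal(xml_text: str) -> bool:
    """Emulate count(distinct-values(//group[starts-with(., 'ID@')])) = 1.

    Hypothetical helper for illustration, not an XPath engine.
    """
    root = ET.fromstring(xml_text)
    distinct = set()
    # root.iter("group") visits every <group> in the document, like //group.
    for element in root.iter("group"):
        # Joining itertext() approximates the XPath string-value of the element.
        value = "".join(element.itertext())
        if value.startswith("ID@"):
            distinct.add(value)
    # Exactly one distinct value means all matching elements agree.
    return len(distinct) == 1
```

Note that, like the XPath expression it mirrors, this sketch yields false when no matching element exists at all, because the count of distinct values is then 0 rather than 1.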
CodePudding user response:
This may look strange, but I will explain it:
not(//group[starts-with(text(), 'ID@')] != //group[starts-with(text(), 'ID@')])
The expression selects the set of ID nodes and compares it with the same set of ID nodes using the != operator. You might think that $x != $x would always return false, but that is not actually how it works. See the section on boolean expressions in the XPath 1.0 specification:
If both objects to be compared are node-sets, then the comparison will be true if and only if there is a node in the first node-set and a node in the second node-set such that the result of performing the comparison on the string-values of the two nodes is true.
In other words, if any pair of nodes, one from the first node-set and one from the second, satisfies the comparison operator (!=, in our case), then the expression returns true.
In our case, if you have a node-set $x and you evaluate the expression $x != $x, it will return true if there is a member of $x which is not equal to some other member of $x.
Using the not() function you can then negate the result: not($x != $x) will return false if the members of $x are not all equal to each other.
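The pairwise rule quoted from the specification can be modelled in a few lines of Python. This is only a sketch of the comparison semantics, with node-sets represented as plain lists of string-values; the names nodeset_not_equal and all_same are made up for illustration.

```python
from itertools import product


def nodeset_not_equal(left, right):
    """Model XPath 1.0 `left != right` for two node-sets.

    Each node-set is given as a list of string-values (an assumption of
    this sketch). The result is true if at least one pair differs.
    """
    return any(a != b for a, b in product(left, right))


def all_same(values):
    """Model not($x != $x): true unless some pair of members differs."""
    return not nodeset_not_equal(values, values)


print(all_same(["ID@12", "ID@13", "ID@12"]))  # values from Input 1
print(all_same(["ID@12", "ID@12"]))           # values from Input 2
```

One consequence of the pairwise rule is that an empty node-set has no pairs to compare, so $x != $x is false and not($x != $x) is true when the element is missing entirely.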