I have a string like this:
AA 12345678910
BB TESTTESTTEST
BB TESTTESTTEST
BB TESTTESTTEST
CC TEST
AA 0897654321
BB TESTTESTTEST
CC TEST
How would I group this data by AA? This is just a plain string, by the way. I could do this by position, but the BB lines can occur multiple times.
Is it possible to tokenize a chunk of a string? In one sentence: "Group by AA until another AA shows up".
CodePudding user response:
Assuming this input:
<input>
AA 12345678910
BB TESTTESTTEST
BB TESTTESTTEST
BB TESTTESTTEST
CC TEST
AA 0897654321
BB TESTTESTTEST
CC TEST
</input>
and this XSLT:
<xsl:for-each select="tokenize(input, '^AA ', 'm')">
<xsl:if test="normalize-space()">
<block>AA <xsl:value-of select="." /></block>
</xsl:if>
</xsl:for-each>
we get two blocks:
<block>AA 12345678910
BB TESTTESTTEST
BB TESTTESTTEST
BB TESTTESTTEST
CC TEST
</block><block>AA 0897654321
BB TESTTESTTEST
CC TEST
</block>
tokenize() splits the input string at a delimiter, but it removes the delimiter in the process. That's why we need to add the 'AA ' back manually in the output.
CodePudding user response:
In XSLT 3 (supported since 2017 by Saxon 9.8 and later, Saxon-JS 2, and Altova XML 2017 R3 and later), you can use xsl:for-each-group with group-starting-with on a sequence of strings:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="3.0">
<xsl:output indent="yes"/>
<xsl:template match="data">
<xsl:copy>
<xsl:for-each-group select="tokenize(., '\n')[normalize-space()]" group-starting-with=".[starts-with(., 'AA')]">
<group pos="{position()}">
<xsl:apply-templates select="current-group()"/>
</group>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
<xsl:template match=".[. instance of xs:string]">
<xsl:element name="{substring(., 1, 2)}"/>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/6qaHaS5
One way to use for-each-group in XSLT 2, similar to the above, would be to first transform the text lines into XML elements.
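The conversion is needed because in XSLT 2 the group-starting-with pattern can only match nodes, not plain strings. A minimal sketch of that idea follows, assuming the text sits in a data element as in the XSLT 3 example; the temporary line element and the $lines variable are names made up for illustration, and the group and pos names are simply carried over from above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#all"
    version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="data">
    <!-- Step 1: wrap every non-blank text line in a temporary <line> element
         (hypothetical name) so that the lines become nodes -->
    <xsl:variable name="lines">
      <xsl:for-each select="tokenize(., '\r?\n')[normalize-space()]">
        <line><xsl:value-of select="normalize-space(.)"/></line>
      </xsl:for-each>
    </xsl:variable>
    <xsl:copy>
      <!-- Step 2: start a new group at every line that begins with 'AA' -->
      <xsl:for-each-group select="$lines/line"
          group-starting-with="line[starts-with(., 'AA')]">
        <group pos="{position()}">
          <xsl:for-each select="current-group()">
            <!-- Use the two-letter prefix as the element name and
                 the rest of the line as its content -->
            <xsl:element name="{substring(., 1, 2)}">
              <xsl:value-of select="substring-after(., ' ')"/>
            </xsl:element>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Unlike the XSLT 3 version, this sketch also keeps the text after the prefix as element content. It assumes every line starts with a two-letter code that is a valid XML element name, followed by a space.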