XSL style: using <xsl:for-each select> or <xsl:template match> or other solutions in XSL

I'm learning to use XSL to transform XML into HTML/XHTML.

The XSLT <xsl:for-each> element is a core element of the language that allows looping. However, posts here and elsewhere suggest that relying on it is common for beginners (which I am!) and is poor style.

My question is: what are better (more efficient, more elegant, or more idiomatic) alternatives to <xsl:for-each> loops, and why?

In the example below I used nested <xsl:for-each> and <xsl:choose> elements to loop through the required nodes with a conditional <xsl:when> test. This works okay and selects the nodes I need, but does feel rather clunky...

Your wisdom and insights would be greatly appreciated!

My example XML is a report generated by a Stanford HIVdb database query: https://hivdb.stanford.edu/hivdb/by-sequences/

XSD schema is here: https://hivdb.stanford.edu/DR/schema/sierra.xsd

My example XML report is here: https://github.com/delfair/xml_examples/blob/main/Resistance_1636677016671.xml

My example XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">

<html>
<head>
    <title>Example Report</title>
</head>
<body>

<h3>Significant mutations</h3>

<xsl:for-each select=".//geneData">
    <xsl:choose>
        <xsl:when test="gene='HIV1PR'">
        Protease inhibitor mutations<br/><br/>
        </xsl:when>
        <xsl:when test="gene='HIV1RT'">
        Reverse transcriptase inhibitor mutations<br/><br/>
        </xsl:when>
        <xsl:when test="gene='HIV1IN'">
        Integrase inhibitor mutations<br/><br/>
        </xsl:when>
    </xsl:choose>
<table>
<xsl:for-each select=".//mutation">
    <xsl:choose>
        <xsl:when test="classification='PI_MAJOR' or classification='PI_MINOR' or classification='NRTI' or classification='NNRTI' or classification='INI_MAJOR' or classification='INI_MINOR'">
        <tr>
        <td>Class</td>
        <td>Mutation</td>
        </tr>
        <tr>
            <td><xsl:value-of select="classification"/></td>
            <td><xsl:value-of select="mutationString"/></td>
        </tr>
        </xsl:when>
    </xsl:choose>
</xsl:for-each>
</table><br/>
</xsl:for-each>

</body>
</html>

</xsl:template>
</xsl:stylesheet>

Resulting HTML:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Example Report</title>
</head>
<body>
<h3>Significant mutations</h3>
Protease inhibitor mutations<br><br><table></table>
<br>
Reverse transcriptase inhibitor mutations<br><br><table>
<tr>
<td>Class</td>
<td>Mutation</td>
</tr>
<tr>
<td>NNRTI</td>
<td>K103N</td>
</tr>
</table>
<br>
Integrase inhibitor mutations<br><br><table></table>
<br>
</body>
</html>

CodePudding user response:

First of all, xsl:for-each is NOT used for "looping". It has no exit condition and there is no way to pass the result of one iteration to another.
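A minimal sketch of what that means in practice, assuming the element names from the question's report (geneData, gene, mutation). Each node is processed independently, so a total is obtained with one XPath expression over the node-set rather than by carrying a counter from one node to the next:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <ul>
      <xsl:for-each select=".//geneData">
        <li>
          <xsl:value-of select="gene"/>
          <xsl:text>: </xsl:text>
          <!-- No accumulator variable: count() is evaluated over the whole node-set at once -->
          <xsl:value-of select="count(.//mutation)"/>
          <xsl:text> mutation(s), item </xsl:text>
          <!-- position() reports where this node sits in the selected set;
               it is not a loop counter that the body can change -->
          <xsl:value-of select="position()"/>
          <xsl:text> of </xsl:text>
          <xsl:value-of select="last()"/>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```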

Next, using xsl:for-each is NOT limited to beginners, nor is it "poor style" - despite what you might have read here or anywhere else.

The xsl:for-each instruction is no more than a shortcut used in a special case. The general approach works in two stages:

first, you select a set of nodes and tell the processor to apply templates to them;
next, the processor finds the template that best matches each node in the selected node-set and applies it.

In the case where (1) you want to apply uniform processing to all nodes in the selected set and (2) there is no need to apply the template recursively or re-use it otherwise, you can simply tell the processor: take these nodes and apply this template to them.

And that's exactly what the xsl:for-each instruction does. It has a select attribute to select the nodes to process and its content is a template to be applied to the selected node-set. Nothing more, nothing less. There is no paradigm shift here. There is no "push style" vs. "pull-style". There is no good vs. evil.
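To make the equivalence concrete, here is a small sketch using the mutation and mutationString element names from the question. Both lists should come out identical; the only difference is where the template body lives:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- Form A: select the nodes and supply the template inline -->
        <ul>
          <xsl:for-each select=".//mutation">
            <li><xsl:value-of select="mutationString"/></li>
          </xsl:for-each>
        </ul>

        <!-- Form B: select the nodes and let the processor find the template -->
        <ul>
          <xsl:apply-templates select=".//mutation"/>
        </ul>
      </body>
    </html>
  </xsl:template>

  <!-- Same body as Form A, but reusable and open to more specific match patterns -->
  <xsl:template match="mutation">
    <li><xsl:value-of select="mutationString"/></li>
  </xsl:template>
</xsl:stylesheet>
```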

The only problem with xsl:for-each arises when it is the only tool in the stylesheet author's toolbox. As I said, it is a shortcut that can be used in special circumstances. When those circumstances do not apply, using it leads to very poor code.

CodePudding user response:

I guess what you mean by "advanced style" is using templates (that do pattern matching) instead of xsl:for-each "loops".

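The core functionality of your code could be transformed as follows (note that, as written, this version emits all the gene headings first and then a single table of mutations, rather than one table per gene):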

...
      <h3>Significant mutations</h3>

      <xsl:apply-templates select=".//geneData" />
      <table>
        <xsl:apply-templates select=".//mutation" />
      </table><br/>
    </body>
  </html>
</xsl:template>
<!-- End of main template matching "/" -->

<xsl:template match="geneData[gene='HIV1PR']">Protease inhibitor mutations<br/><br/></xsl:template>
<xsl:template match="geneData[gene='HIV1RT']">Reverse transcriptase inhibitor mutations<br/><br/></xsl:template>
<xsl:template match="geneData[gene='HIV1IN']">Integrase inhibitor mutations<br/><br/></xsl:template>
<xsl:template match="geneData" />   <!-- Discard all not matched 'geneData' elements -->    

<xsl:template match="mutation[classification='PI_MAJOR' or classification='PI_MINOR' or classification='NRTI' or classification='NNRTI' or classification='INI_MAJOR' or classification='INI_MINOR']">
        <tr>
            <td>Class</td>
            <td>Mutation</td>
        </tr>
        <tr>
            <td><xsl:value-of select="classification"/></td>
            <td><xsl:value-of select="mutationString"/></td>
        </tr>
</xsl:template>
<xsl:template match="mutation" />   <!-- Discard all not matched 'mutation' elements -->    

</xsl:stylesheet>

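In the above code, both node sets (.//geneData and .//mutation) are passed to xsl:apply-templates. For each node, the processor selects the single best-matching template and executes it. A pattern with a predicate (the [...] part of match="...") has a higher default priority (0.5) than a bare element name (0), so the predicate templates win when their condition holds, and the empty match="geneData" and match="mutation" templates catch everything else. These short xsl:template rules replace the xsl:when branches of your code.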

This is generally regarded as the idiomatic approach to XSLT development. In practice there are use cases where xsl:for-each is preferable for clarity, and for simple, uniform processing the two approaches are interchangeable.
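If the per-gene grouping of the original stylesheet matters, one possible variant keeps the nesting by letting the geneData template apply templates to its own mutations. This is only a sketch: it assumes gene is a direct child of geneData (as the question's tests imply), and it deliberately differs from the original output in two ways, printing the header row once per table and skipping the table when a gene has no significant mutations. The variable name significant is made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <head><title>Example Report</title></head>
      <body>
        <h3>Significant mutations</h3>
        <xsl:apply-templates select=".//geneData"/>
      </body>
    </html>
  </xsl:template>

  <!-- One section per gene: heading first, then that gene's own mutations -->
  <xsl:template match="geneData">
    <!-- Hypothetical helper variable: the mutations of this gene worth reporting -->
    <xsl:variable name="significant"
        select=".//mutation[classification='PI_MAJOR' or classification='PI_MINOR'
                or classification='NRTI' or classification='NNRTI'
                or classification='INI_MAJOR' or classification='INI_MINOR']"/>
    <xsl:apply-templates select="gene"/>
    <!-- A non-empty node-set is true, so empty tables are not emitted -->
    <xsl:if test="$significant">
      <table>
        <tr><td>Class</td><td>Mutation</td></tr>
        <xsl:apply-templates select="$significant"/>
      </table>
    </xsl:if>
    <br/>
  </xsl:template>

  <!-- Headings: the predicate templates outrank the empty fallback below -->
  <xsl:template match="gene[.='HIV1PR']">Protease inhibitor mutations<br/><br/></xsl:template>
  <xsl:template match="gene[.='HIV1RT']">Reverse transcriptase inhibitor mutations<br/><br/></xsl:template>
  <xsl:template match="gene[.='HIV1IN']">Integrase inhibitor mutations<br/><br/></xsl:template>
  <xsl:template match="gene"/>

  <!-- One table row per selected mutation -->
  <xsl:template match="mutation">
    <tr>
      <td><xsl:value-of select="classification"/></td>
      <td><xsl:value-of select="mutationString"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

Because the filtering happens in the select expression, the mutation template needs no predicate and no empty fallback template.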