XSLT 3 | Hash Function

Time:02-02

We've been looking to generate a hash of a certain text from a given document and came up with the following version of XSLT:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:iway="http://iway.company.com/saxon-extension">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" exclude-result-prefixes="iway"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="*[not(descendant::text()[normalize-space()])]"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>     
    </xsl:template>
    
    <xsl:template match="row" exclude-result-prefixes="iway">
    <xsl:variable name="jsonForHash" select="JSON_Output/text()"/>
    <xsl:variable name="iflExpression" select="concat('_sha1(''', $jsonForHash, ''')')"/>
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
            <CurrentDataHash type="12" typename="varchar"><xsl:value-of select="iway:ifl($iflExpression)"/></CurrentDataHash>   
            <Duplicity type="12" typename="varchar"><xsl:value-of select="$jsonForHash = LastDataHash/text()"/></Duplicity>     
        </xsl:copy>     
    </xsl:template>

</xsl:stylesheet>

...which does the job. The downside is that it cannot be tested locally (in Altova or Stylus Studio) without modification, and we would like to be able to do that. It works only in a runtime that relies on Saxon-HE-9*. In an attempt to fix this, we tried the version below (inspired by HERE):

<xsl:transform version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:digest="java?path=jar:file:///C:/libs/commons-codec-1.13.jar!/">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

    <xsl:template match="/">
        <Output>
            <xsl:apply-templates mode="hash"/>
        </Output>
    </xsl:template>
    
    <xsl:template match="SKU_SEG" mode="hash">
        <Group>
            <xsl:variable name="val" select="."/>
            <xsl:copy-of select="$val"/>
            <xsl:variable name="hash-val" select="digest:org.apache.commons.codec.digest.DigestUtils.md5Hex($val)"/>
            <HashValue>
                <xsl:value-of select="$hash-val"/>
            </HashValue>
        </Group>
    </xsl:template>
    
</xsl:transform>

...which works locally in Altova but not in our runtime, because we use Saxon-HE and this feature (calling Java methods reflexively) is supported only in Saxon-PE/EE. To overcome this, we came up with this version:

<xsl:transform version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:digest="java?path=jar:file:///C:/libs/commons-codec-1.13.jar!/" xmlns:iway="http://iway.company.com/saxon-extension" exclude-result-prefixes="digest iway">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes" exclude-result-prefixes="digest iway"/>
    <xsl:template match="/">
        <Output>
            <xsl:apply-templates mode="hash"/>
        </Output>
    </xsl:template>
    <xsl:template match="SKU_SEG" mode="hash">
        <xsl:variable name="parserInfo" select="system-property('xsl:vendor')"/>
        <Group>
            <xsl:variable name="textForHash" select="."/>
            <xsl:variable name="iflExpression" select="concat('_sha1(''', $textForHash, ''')')"/>
            <xsl:copy-of select="$textForHash"/>
            <xsl:variable name="hashedVal">
                <xsl:choose>
                    <xsl:when test="contains(lower-case($parserInfo), 'saxon')">
                        <xsl:value-of select="iway:ifl($textForHash)"/>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:value-of select="digest:org.apache.commons.codec.digest.DigestUtils.md5Hex($textForHash)"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:variable>
            <HashValue>
                <xsl:value-of select="$hashedVal"/>
            </HashValue>
        </Group>
    </xsl:template>
</xsl:transform>

...which works locally in Altova XMLSpy but not in the runtime, where Saxon reports the following error:

net.sf.saxon.trans.XPathException: 
Cannot find a 1-argument function named 
Q{java?path=jar:file:///C:/libs/commons-codec-1.13.jar!/}
org.apache.commons.codec.digest.DigestUtils.md5Hex(). 
Reflexive calls to Java methods are not available under Saxon-HE

Now the question: Is it possible to achieve the requirement at all? Thanks in advance.

Setup Info: 
Runtime: Java Application relying on Saxon-HE
XSLT Versions Supported: 1/2/3
Standalone Tool for local tests: Altova XMLSpy

PS: The version below (inspired by HERE) appears to work both locally and remotely if the text to be hashed is not too long. The text being hashed here, however, is longer than what is permitted in an HTTP URL, so this is not an option:

<xsl:transform version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    
    <xsl:template match="/">
        <Output>
            <arg0>
                <xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
                <xsl:copy>
                    <xsl:apply-templates/>
                </xsl:copy>
                <xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
            </arg0>
            <arg1>
                <xsl:apply-templates mode="hash"/>
            </arg1>
        </Output>
    </xsl:template>
    
    <xsl:template match="SKU_SEG">
        <xsl:copy-of select="."/>
    </xsl:template>
    
    <xsl:template match="SKU_SEG" mode="hash">
        <xsl:variable name="val" select="."/>
        <!-- delegate to an external REST service to calculate the MD5 hash of the value -->
        <xsl:variable name="hash-val" select="unparsed-text(concat('http://localhost/md5?text=', encode-for-uri($val)))"/>
        <!-- the response from this service is wrapped in quotes, so need to trim those off -->
        <xsl:value-of select="substring($hash-val, 2, string-length($hash-val) - 2)"/>
    </xsl:template>
    
</xsl:transform>

For reference, here is the Saxon extension function:

 private void registeriWayXsltExtensions_iFLEval(final XDDocument docIn) {
    log(".init() Registering iWay XSLT extensions...", "info");
    this.iway_xslt_extension_ifl = new ExtensionFunction() {
        public QName getName() {
          return new QName("http://iway.company.com/saxon-extension", "ifl");
        }
        
        public SequenceType getResultType() {
          return SequenceType.makeSequenceType(ItemType.STRING, OccurrenceIndicator.ONE);
        }
        
        public SequenceType[] getArgumentTypes() {
          return 
            new SequenceType[] { SequenceType.makeSequenceType(ItemType.STRING, OccurrenceIndicator.ONE) };
        }
        
        public XdmValue call(XdmValue[] arguments) throws SaxonApiException {
          String iflExpression = ((XdmAtomicValue)arguments[0].itemAt(0)).getStringValue();
          SaxonXsltAgent.this.log(".execute()  Received iFL Expression: " + iflExpression, "info");
          String iflResult = null;
          if (iflExpression != null && !iflExpression.equals(""))
            iflResult = XDUtil.evaluate(iflExpression, docIn, SaxonXsltAgent.this.getSRM()); 
          return (XdmValue)new XdmAtomicValue(iflResult);
        }
      };
    this.xsltProcessor.registerExtensionFunction(this.iway_xslt_extension_ifl);
    log(".execute() \"ifl\" registered.", "info");
  }

CodePudding user response:

If you want to call out to Java in SaxonJ-HE, then you need to implement an "integrated extension function" and register it with the Saxon configuration, rather than relying on dynamic loading and reflexive invocation.

It's not difficult: see https://www.saxonica.com/documentation12/index.html#!extensibility/extension-functions-J/ext-simple-J
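As a rough illustration of what the body of such an integrated extension function could delegate to, here is a minimal sketch that computes a hex digest with the JDK only, so no commons-codec jar and no reflexive call are needed. The class name HashHelper and its method names are hypothetical, and it assumes the text should be hashed as UTF-8 bytes; whether that matches the output of the iWay _sha1 function is not something the question confirms.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not part of Saxon or iWay): hex digests using only the JDK. */
final class HashHelper {

    private HashHelper() { }

    /** Returns the lowercase hex digest of the UTF-8 bytes of text for the given algorithm. */
    static String hexDigest(String algorithm, String text) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] hash = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Emit the high nibble, then the low nibble, of each byte.
                hex.append(Character.forDigit((b >> 4) & 0xF, 16));
                hex.append(Character.forDigit(b & 0xF, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 and SHA-1 are required on every Java platform, so this is unexpected.
            throw new IllegalStateException("Digest algorithm not available: " + algorithm, e);
        }
    }

    /** SHA-1 digest as 40 lowercase hex characters. */
    static String sha1Hex(String text) {
        return hexDigest("SHA-1", text);
    }

    /** MD5 digest as 32 lowercase hex characters. */
    static String md5Hex(String text) {
        return hexDigest("MD5", text);
    }
}
```

An ExtensionFunction like the one shown in the question could then return new XdmAtomicValue(HashHelper.sha1Hex(arguments[0].itemAt(0).getStringValue())) from its call method, and because the text is passed as a function argument, its length is not limited the way a URL is.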

CodePudding user response:

I would try e.g.

<xsl:value-of 
  select="iway:ifl($textForHash)" 
  use-when="exists(function-lookup(QName('http://iway.company.com/saxon-extension', 'ifl'), 1))"/>

and

<xsl:value-of select="digest:org.apache.commons.codec.digest.DigestUtils.md5Hex($textForHash)" 
  use-when="exists(function-lookup(QName('java?path=jar:file:///C:/libs/commons-codec-1.13.jar!/', 'org.apache.commons.codec.digest.DigestUtils.md5Hex'), 1))"/>

Here is a complete example of that approach (admittedly, it should work only with Saxon HE 10 or later, as earlier HE versions did not support higher-order functions, which function-lookup relies on):

package org.example;

import net.sf.saxon.s9api.*;

import javax.xml.transform.stream.StreamSource;

public class Main {
    public static void main(String[] args) throws SaxonApiException {
        Processor processor = new Processor(true);

        ExtensionFunction sqrt = new ExtensionFunction() {
            public QName getName() {
                return new QName("http://example.org/mf", "sqrt");
            }

            public SequenceType getResultType() {
                return SequenceType.makeSequenceType(
                        ItemType.DOUBLE, OccurrenceIndicator.ONE
                );
            }

            public SequenceType[] getArgumentTypes() {
                return new SequenceType[]{
                        SequenceType.makeSequenceType(
                                ItemType.DOUBLE, OccurrenceIndicator.ONE)};
            }

            public XdmValue call(XdmValue[] arguments) throws SaxonApiException {
                double arg = ((XdmAtomicValue)arguments[0].itemAt(0)).getDoubleValue();
                double result = Math.sqrt(arg);
                return new XdmAtomicValue(result);
            }
        };

        processor.registerExtensionFunction(sqrt);

        XsltCompiler xsltCompiler = processor.newXsltCompiler();

        Xslt30Transformer xslt30Transformer = xsltCompiler.compile(new StreamSource("sheet1.xsl")).load30();

        xslt30Transformer.callTemplate(null, xslt30Transformer.newSerializer(System.out));
    }
}

XSLT

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:mf="http://example.org/mf"
                xmlns:java-math="java:java.lang.Math"
                exclude-result-prefixes="#all"
                expand-text="yes">

    <xsl:mode on-no-match="shallow-copy"/>

    <xsl:output indent="yes"/>

    <xsl:template match="/" name="xsl:initial-template">
        <test>
            <integrated-extension-function>
                <xsl:value-of select="mf:sqrt(4)" use-when="exists(function-lookup(QName('http://example.org/mf', 'sqrt'), 1))"/>
            </integrated-extension-function>
            <reflexive-extension-function>
                <xsl:value-of select="java-math:sqrt(4)" use-when="exists(function-lookup(QName('java:java.lang.Math', 'sqrt'), 1))"/>
            </reflexive-extension-function>
        </test>
        <xsl:comment>Run with {system-property('xsl:product-name')} {system-property('xsl:product-version')} {system-property('Q{http://saxon.sf.net/}platform')}</xsl:comment>
    </xsl:template>

</xsl:stylesheet>

When run with the above Java code, which registers the function, in Saxon HE 11, the result is e.g.

<?xml version="1.0" encoding="UTF-8"?>
<test>
   <integrated-extension-function>2</integrated-extension-function>
   <reflexive-extension-function/>
</test>
<!--Run with SAXON HE 11.4 -->

When the XSLT is run through Saxon EE without registering the integrated extension function, the output is

<?xml version="1.0" encoding="UTF-8"?>
<test>
   <integrated-extension-function/>
   <reflexive-extension-function>2</reflexive-extension-function>
</test>
<!--Run with SAXON EE 11.4 -->

So with (xsl):use-when="exists(function-lookup(..))" you can conditionally include code only when a certain function is available.

Sample project on Github: https://github.com/martin-honnen/SaxonHEIntegratedExtFnSample2
