I want to turn this airplane manual SGML/XML input, which contains revst/revend tags:
<em>
<revst/>
<prclist>
<title>This list of steps was added</title>
</prclist>
<prclist>
<title>Another list of steps was added</title>
</prclist>
<revend/>
<chapter>
<WARNING>
<revst/>
<PARA>First changed paragraph showing revst at deeper depth.</PARA>
</WARNING>
<PARA>Second changed paragraph showing revst at deeper depth.</PARA>
<revend/>
</chapter>
<listitem>
<revst/>
<PARA>First changed paragraph showing revst at higher depth</PARA>
<NOTE>
<PARA>Second changed paragraph showing revst at higher depth</PARA>
<revend/>
</NOTE>
</listitem>
<prclist>
<title>This list of steps was unchanged</title>
</prclist>
<para>
Some text
<revst/>and some changed text here.<revend/>
This text didn't change.
</para>
</em>
Into this:
<em>
<prclist revised="1">
<title revised="1">This list of steps was added</title>
</prclist>
<prclist revised="1">
<title revised="1">Another list of steps was added</title>
</prclist>
<chapter>
<WARNING>
<PARA revised="1">First changed paragraph showing revst at deeper depth.</PARA>
</WARNING>
<PARA revised="1">Second changed paragraph showing revst at deeper depth.</PARA>
</chapter>
<listitem>
<PARA revised="1">First changed paragraph showing revst at higher depth</PARA>
<NOTE revised="1">
<PARA revised="1">Second changed paragraph showing revst at higher depth</PARA>
</NOTE>
</listitem>
<prclist>
<title>This list of steps was unchanged</title>
</prclist>
<para>
Some text
<span revised="1">and some changed text here.</span>
This text didn't change.
</para>
</em>
Reason: I believe setting a "revised" attribute on all tags in a first processing pass will make the final HTML conversion easier in a second pass. If this pass isn't easy/clean to do in XSLT 3, I will just write a program to do it.
The final goal is to have a background color set in HTML for all "revised" elements/text.
Assume that revst/revend pairs cannot overlap each other in the input document.
CodePudding user response:
Wow, this came out much cleaner than I thought, using an accumulator.
This stylesheet:
<xsl:stylesheet version="3.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
>
<xsl:mode use-accumulators="revisionCheck"/>
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="*">
<xsl:variable name="revised" select="accumulator-before('revisionCheck')"/>
<xsl:copy>
<xsl:if test="$revised = 1">
<!-- Add a "revised" attribute to this element -->
<xsl:attribute name="revised" select="$revised"></xsl:attribute>
</xsl:if>
<xsl:apply-templates>
<!-- Pass a parameter indicating if we are already inside a revised parent element.
This is useful for eliminating redundant <spans> in text nodes. -->
<xsl:with-param name="parent_revised" select="$revised"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<!-- Remove these tags from the output. -->
<xsl:template match="revst | revend">
</xsl:template>
<!--
Copy text. If it is revised, not whitespace-only, and NOT already in a revised parent element, wrap it in a span.
-->
<xsl:template match="text()">
<xsl:param name="parent_revised" />
<xsl:variable name="revised" select="accumulator-before('revisionCheck')"/>
<xsl:choose>
<!-- normalize-space() keeps whitespace-only text between elements from being wrapped;
not(... = 1) also covers the case where no parameter was passed. -->
<xsl:when test="$revised = 1 and not($parent_revised = 1) and normalize-space()">
<span revised="{$revised}"><xsl:value-of select="."/></span>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!--
Keep track of when we see revst and revend tags.
This seems to work even when revst is at a deeper level than the ending revend,
or vice-versa, yay!
Note that it doesn't matter what phase we use (start or end), because the tags
don't contain any children.
-->
<xsl:accumulator name="revisionCheck" as="xs:integer" initial-value="-1" >
<xsl:accumulator-rule match="revst" select="1"/>
<xsl:accumulator-rule match="revend" select="0"/>
</xsl:accumulator>
</xsl:stylesheet>
Produces this output, just what I wanted:
<?xml version="1.0" encoding="UTF-8"?>
<em>
<prclist revised="1">
<title revised="1">This list of steps was added</title>
</prclist>
<prclist revised="1">
<title revised="1">Another list of steps was added</title>
</prclist>
<chapter>
<WARNING>
<PARA revised="1">First changed paragraph showing revst at deeper depth.</PARA>
</WARNING>
<PARA revised="1">Second changed paragraph showing revst at deeper depth.</PARA>
</chapter>
<listitem>
<PARA revised="1">First changed paragraph showing revst at higher depth</PARA>
<NOTE revised="1">
<PARA revised="1">Second changed paragraph showing revst at higher depth</PARA>
</NOTE>
</listitem>
<prclist>
<title>This list of steps was unchanged</title>
</prclist>
<para>
Some text
<span revised="1">and some changed text here.</span>
This text didn't change.
</para>
</em>
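For the second pass mentioned in the question, here is a minimal sketch of how the "revised" attribute could be turned into a background color. It assumes the first-pass output above is the input; the color value and the use of an inline style attribute are my own assumptions, and a real HTML conversion would also map the manual's element names (prclist, PARA, ...) to HTML elements, which this sketch does not attempt.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy everything that has no more specific template below. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumption: revised="1" is the marker written by the first pass.
       Replace it with an inline style; the color is an arbitrary choice. -->
  <xsl:template match="@revised[. = '1']">
    <xsl:attribute name="style" select="'background-color: #ffff99'"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the shallow-copy mode applies templates to attributes, every element or span that carried revised="1" ends up with the style attribute instead. If the input elements could already carry a style attribute, the two would need to be merged, which is not handled here.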
CodePudding user response:
To identify the nodes "inside" of <revst/>..<revend/>, you can use a nested for-each-group with group-starting-with/group-ending-with. With XSLT 3 you can store the groups in a variable as a sequence of arrays and pass that variable as a tunnel parameter through a mode that checks whether a node is part of a group and, if so, adds the attribute:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all">
<xsl:template match="/*">
<xsl:variable name="rev-groups" as="array(node())*">
<xsl:for-each-group select="descendant::node()" group-starting-with="revst">
<xsl:if test="self::revst">
<xsl:for-each-group select="tail(current-group())" group-ending-with="revend">
<xsl:if test="current-group()[last()][self::revend]">
<xsl:sequence select="array{ current-group()[position() lt last()] }"/>
</xsl:if>
</xsl:for-each-group>
</xsl:if>
</xsl:for-each-group>
</xsl:variable>
<xsl:copy>
<xsl:apply-templates select="@*, node()">
<xsl:with-param name="rev-groups" tunnel="yes" select="$rev-groups"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="text()[normalize-space()]">
<xsl:param name="rev-groups" tunnel="yes"/>
<xsl:choose>
<xsl:when test=". intersect $rev-groups?1">
<span revised="1">
<xsl:next-match/>
</span>
</xsl:when>
<xsl:otherwise>
<xsl:next-match/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="*">
<xsl:param name="rev-groups" tunnel="yes"/>
<xsl:choose>
<xsl:when test=". intersect $rev-groups?*">
<xsl:copy>
<xsl:attribute name="revised" select="1"/>
<xsl:apply-templates select="@*, node()"/>
</xsl:copy>
</xsl:when>
<xsl:otherwise>
<xsl:next-match/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="revst | revend"/>
</xsl:stylesheet>
I think the code works well for adding the revised attribute to elements. The code that wraps other nodes in a span with revised is probably not going to work as posted if comments or processing instructions occur as well. I am also not sure whether that part of the requirement is clearly specified by the single example.
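One possible way around the comment/processing-instruction limitation, as an untested sketch: instead of checking whether a text node is the first member of a group, check whether it is in any group while its parent element is not. The grouping and the element template are repeated unchanged from the stylesheet above; only the text template differs. The variable name revised-nodes is introduced here for illustration, and the rule "wrap revised text whose parent is not revised" is my reading of the single example, not a confirmed requirement.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all">

  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Collect the nodes between each revst/revend pair, one array per pair. -->
  <xsl:template match="/*">
    <xsl:variable name="rev-groups" as="array(node())*">
      <xsl:for-each-group select="descendant::node()" group-starting-with="revst">
        <xsl:if test="self::revst">
          <xsl:for-each-group select="tail(current-group())" group-ending-with="revend">
            <xsl:if test="current-group()[last()][self::revend]">
              <xsl:sequence select="array{ current-group()[position() lt last()] }"/>
            </xsl:if>
          </xsl:for-each-group>
        </xsl:if>
      </xsl:for-each-group>
    </xsl:variable>
    <xsl:copy>
      <xsl:apply-templates select="@*, node()">
        <xsl:with-param name="rev-groups" tunnel="yes" select="$rev-groups"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <!-- Changed part: wrap any non-blank revised text whose parent element is
       not itself revised, wherever the text sits inside its group. -->
  <xsl:template match="text()[normalize-space()]">
    <xsl:param name="rev-groups" tunnel="yes" as="array(node())*"/>
    <xsl:variable name="revised-nodes" select="$rev-groups?*"/>
    <xsl:choose>
      <xsl:when test="exists(. intersect $revised-nodes)
                      and empty(.. intersect $revised-nodes)">
        <span revised="1"><xsl:next-match/></span>
      </xsl:when>
      <xsl:otherwise>
        <xsl:next-match/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Unchanged: elements inside a group get the attribute. -->
  <xsl:template match="*">
    <xsl:param name="rev-groups" tunnel="yes" as="array(node())*"/>
    <xsl:choose>
      <xsl:when test=". intersect $rev-groups?*">
        <xsl:copy>
          <xsl:attribute name="revised" select="1"/>
          <xsl:apply-templates select="@*, node()"/>
        </xsl:copy>
      </xsl:when>
      <xsl:otherwise>
        <xsl:next-match/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="revst | revend"/>

</xsl:stylesheet>
```

With this variant a comment placed between <revst/> and the changed text no longer prevents the span, and text inside an already revised element is still left unwrapped. Comments and processing instructions themselves are copied as they are.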