XSLT (2.0 or 3.0) way of duplicating whole xml for each comma separated value stored in xml into sep

Time:10-07

I have an XML file that is a workOrder from a web order system. It contains a lot of order data, and one of the values is a delimited string of multiple file paths. I would like to duplicate the full XML and output one XML file for each fileURL, replacing the value with that single file path. The reason is that the workflow system used later on reads the path to the file, picks the file up, and associates the XML with it as metadata for further processing, so one XML file is needed per file.

Input XML (the part containing the stored paths):

<rootNode> 
... 
<properties>
<property>
<name label='fileURL'>fileurl</name>
<value>\\nas02\Order\O10346_OP176786_X1.pdf, \\nas02\Order\Weborder\O10346_OP176789_X2.pdf, \\nas02\Order\Weborder\O10346_OP176795_X3.pdf, \\nas02\Order\Weborder\O10346_OP176796_X1.pdf,
</value>   
</property>   
</properties> 
</technicalSpec> 
... 
</rootNode>

Expected output would be one xml for each fileURL containing the same data, except the property value should be the single fileURL for each copy:

<rootNode> 
    ... 
    <properties>
    <property>
    <name label='fileURL'>fileurl</name>
    <value>\\nas02\Order\O10346_OP176786_X1.pdf
    </value>   
    </property>   
    </properties> 
    </technicalSpec> 
    ... 
    </rootNode>

I know how to get the csv string into a variable:

<xsl:variable name="csv" select="//property[name='fileurl']/value"></xsl:variable>

I have found that I can do a for-each loop over the values:

<xsl:for-each select="tokenize($csv, ',')">

I also found how I can copy the whole XML content:

<xsl:template match="@*|node()">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
    </xsl:template>

And I know I can use "result-document" in a for-each loop to create separate output files.

But I cannot figure out how to combine everything into a working XSLT (if possible) that creates one XML file per CSV value.

CodePudding user response:

This problem is structurally identical to the question "Split google XML items by param value", though I'll refrain from marking it as a duplicate because it might not be obvious to a beginner how to adapt the answer to that question to your needs.

The essence of that approach is:

<xsl:mode on-no-match="shallow-copy"/>

<xsl:template match="/">
  <xsl:variable name="root" select="/*"/>
  <xsl:for-each select="tokenize(//value, ',')!normalize-space()">
     <xsl:result-document href="{position()}.xml">
       <xsl:apply-templates select="$root">
         <xsl:with-param name="current-file" select="."/>
       </xsl:apply-templates>
     </xsl:result-document>
  </xsl:for-each>
</xsl:template>

<xsl:template match="value">
  <xsl:param name="current-file"/>
  <value>{$current-file}</value>
</xsl:template>

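Note that this depends on the fact that the built-in template rules pass parameter values through unchanged (so the parameter effectively behaves like a tunnel parameter). Of course you could also declare it as a tunnel parameter explicitly. Also note that the text value template {$current-file} only works when expand-text="yes" is in effect, for example on the xsl:stylesheet element.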
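The explicit tunnel-parameter variant mentioned above could be sketched as a complete stylesheet like this. It is a minimal sketch under two assumptions that are not part of the answer: the input contains exactly one value element, and empty tokens (such as the one left by the trailing comma in the sample input) should be skipped, which the [.] predicate takes care of.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    expand-text="yes">

  <!-- Copy every node that has no more specific template rule. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="/">
    <xsl:variable name="root" select="/*"/>
    <!-- Assumption: a single value element. The [.] predicate drops
         empty tokens, e.g. the one after a trailing comma. -->
    <xsl:for-each select="(tokenize(//value, ',') ! normalize-space())[.]">
      <xsl:result-document href="{position()}.xml">
        <xsl:apply-templates select="$root">
          <!-- Declared as a tunnel parameter, so it reaches the value
               template without relying on the built-in rules. -->
          <xsl:with-param name="current-file" select="." tunnel="yes"/>
        </xsl:apply-templates>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="value">
    <xsl:param name="current-file" tunnel="yes"/>
    <value>{$current-file}</value>
  </xsl:template>

</xsl:stylesheet>
```

With the tunnel declaration, any explicit templates added later between the root and value do not need to forward the parameter themselves.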

CodePudding user response:

One way to achieve this is to use a variable that is filled by xsl:apply-templates with a mode attribute. The first steps are as you suspected, but changing one element in the resulting document is a bit trickier.

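In this approach, I first keep a reference to the input document in a variable (it is a reference, not a copy). This is needed because inside an xsl:for-each over the tokenized strings the context item is a string, so / can no longer be used to reach the document: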

<xsl:variable name="doc" select="/" />

The comma-separated list of file paths is stored in a second variable, as you already proposed:

<xsl:variable name="csv" select="//property[name='fileurl']/value" />

The xsl:for-each is then applied to the tokenized value of $csv.
As the output filename I simply chose the last path segment of the current token, i.e. the part after the last backslash:

<xsl:variable name="result-name" select="string-join(tokenize(., '\\')[position() = last()], '')" />
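Note that this expression names each result after the PDF itself (for example O10346_OP176786_X1.pdf), even though XML is written to it. If an .xml name is preferred, a minimal sketch could look like the following. It assumes every path ends in .pdf; the f:result-name function and the urn:example:functions namespace are hypothetical names introduced only for this illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="#all">

  <!-- Hypothetical helper: take the last backslash-separated segment
       of a path and swap a trailing .pdf for .xml. -->
  <xsl:function name="f:result-name" as="xs:string">
    <xsl:param name="path" as="xs:string"/>
    <xsl:sequence select="replace(tokenize(normalize-space($path), '\\')[last()], '\.pdf$', '.xml')"/>
  </xsl:function>

  <!-- Small demonstration; produces <name>O10346_OP176786_X1.xml</name> -->
  <xsl:template name="xsl:initial-template">
    <name>
      <xsl:value-of select="f:result-name('\\nas02\Order\O10346_OP176786_X1.pdf')"/>
    </name>
  </xsl:template>

</xsl:stylesheet>
```

Inside the xsl:for-each, the variable could then be declared with select="f:result-name(.)".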

Then I fill a variable by calling xsl:apply-templates with the mentioned mode="new" attribute. Of the two templates in that mode, the one matching value replaces the element's content with the value passed in through xsl:param:

<xsl:variable name="new-doc">
    <xsl:apply-templates select="$doc" mode="new">
        <xsl:with-param name="nam" select="normalize-space(.)" />
    </xsl:apply-templates>
</xsl:variable>

Now, the two templates with the mode="new" attribute are executed.
And, finally, the variable is written with xsl:result-document to the corresponding document:

<xsl:result-document encoding="UTF-8" href="{$result-name}">
    <xsl:copy-of select="$new-doc" />
</xsl:result-document>

The whole stylesheet could look like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

    <xsl:variable name="doc" select="/" />
    <xsl:variable name="csv" select="//property[name='fileurl']/value" />

    <xsl:template match="/">
        <!-- the predicate skips empty tokens, e.g. after a trailing comma -->
        <xsl:for-each select="tokenize($csv, ',')[normalize-space()]">
            <xsl:variable name="result-name" select="string-join(tokenize(., '\\')[position() = last()], '')" />            
            <xsl:variable name="new-doc">
                <xsl:apply-templates select="$doc" mode="new">
                    <xsl:with-param name="nam" select="normalize-space(.)" />
                </xsl:apply-templates>
            </xsl:variable>
            <xsl:result-document encoding="UTF-8" href="{$result-name}">
                <xsl:copy-of select="$new-doc" />
            </xsl:result-document>
        </xsl:for-each>
    </xsl:template>

    <xsl:template match="value" mode="new">
        <xsl:param name="nam" />
        <value><xsl:value-of select="$nam" /></value>
    </xsl:template>

    <!-- identity template -->
    <xsl:template match="node()|@*" mode="new">
        <xsl:param name="nam" />
        <xsl:copy>
            <xsl:apply-templates select="node()|@*" mode="new">
                <xsl:with-param name="nam" select="$nam" />
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template> 

</xsl:stylesheet>

CodePudding user response:

I would use a tunnel parameter and xsl:mode, given XSLT 3:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">

  <xsl:output method="xml" indent="yes"/>
  
  <xsl:template match="/">
    <xsl:variable name="main-root" select="/"/>
    <xsl:for-each select="tokenize(rootNode/technicalSpec/properties/property[name[@label = 'fileURL' and . = 'fileurl']]/value, ',\s*')[normalize-space()]">
      <xsl:result-document href="{substring-before(., '.pdf')}.xml">
        <xsl:apply-templates select="$main-root/*">
          <xsl:with-param name="url" select="." tunnel="yes"/>
        </xsl:apply-templates>        
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>
  
  <xsl:template match="property[name[@label = 'fileURL' and . = 'fileurl']]/value">
    <xsl:param name="url" tunnel="yes"/>
    <xsl:copy>{$url}</xsl:copy>
  </xsl:template>

  <xsl:mode on-no-match="shallow-copy"/>

</xsl:stylesheet>

CodePudding user response:

There are already three answers to this question, and they all do the same thing: tokenize the value element, create a result-document for each token and apply templates with the current token as the parameter intended for the template matching value.

I would suggest a different approach, which I believe is simpler:

XSLT 3.0

<xsl:stylesheet version="3.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
expand-text="yes">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:mode on-no-match="shallow-copy"/>

<xsl:template match="/">
  <xsl:variable name="root" select="."/>
    <xsl:analyze-string select="//value" regex="(.+?)($|, )">
        <xsl:matching-substring>
            <xsl:result-document href="{position()}.xml">
                <xsl:apply-templates select="$root/*"/>
            </xsl:result-document>
        </xsl:matching-substring>
    </xsl:analyze-string>
</xsl:template>

<xsl:template match="value">
    <xsl:copy>{regex-group(1)}</xsl:copy>
</xsl:template>

</xsl:stylesheet>

Note that I have assumed a properly delimited string in the form of:

<value>\\nas02\Order\O10346_OP176786_X1.pdf, \\nas02\Order\Weborder\O10346_OP176789_X2.pdf, \\nas02\Order\Weborder\O10346_OP176795_X3.pdf, \\nas02\Order\Weborder\O10346_OP176796_X1.pdf</value>   

Demo (simulated): https://xsltfiddle.liberty-development.net/3MP42Pb
