I want to use XSLT to extract values from a string that is part of an XML structure. On each line, the word before the colon should become a node name and the word after the colon should become that node's value. The node names will be the same in every document, but the values will vary, so I thought about using wildcards to extract them, but I couldn't find out how to do that. Can you help me?
<MxML>
<mail>
<body>
Fruit: apple
Vagetable: potato
Animal: dog
</body>
</mail>
</MxML>
So the result should look like:
<MxML>
<mail>
<Fruit>apple</Fruit>
<Vagetable>potato</Vagetable>
<Animal>dog</Animal>
</mail>
</MxML>
I'm working with XSLT 2.0.
CodePudding user response:
Here is one way you could look at it:
XSLT 2.0
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="mail">
    <xsl:copy>
      <!-- split the body text into lines and skip the blank ones -->
      <xsl:for-each select="tokenize(body, '\n')[normalize-space()]">
        <xsl:element name="{substring-before(., ': ')}">
          <xsl:value-of select="substring-after(., ': ')"/>
        </xsl:element>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Here's another:
<xsl:template match="mail">
  <xsl:copy>
    <!-- flags="m": ^ and $ match at the start and end of each line -->
    <xsl:analyze-string select="body" regex="^(.+): (.+)$" flags="m">
      <xsl:matching-substring>
        <xsl:element name="{regex-group(1)}">
          <xsl:value-of select="regex-group(2)"/>
        </xsl:element>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:copy>
</xsl:template>
Note that both assume that the first part of each name/value pair is a valid element name.
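If some names might not be valid element names (for example, because they contain a space), one option is to test the name before creating the element. The following is only a sketch under stated assumptions: the ASCII-only pattern is a deliberately conservative approximation of the XML name rules, and the fallback element field with its name attribute is a hypothetical choice, not something from the question.

```xml
<xsl:template match="mail">
  <xsl:copy>
    <xsl:analyze-string select="body" regex="^(.+): (.+)$" flags="m">
      <xsl:matching-substring>
        <xsl:variable name="name" select="regex-group(1)"/>
        <xsl:choose>
          <!-- conservative check: ASCII letter or underscore first,
               then letters, digits, underscore, dot or hyphen -->
          <xsl:when test="matches($name, '^[A-Za-z_][A-Za-z0-9_.\-]*$')">
            <xsl:element name="{$name}">
              <xsl:value-of select="regex-group(2)"/>
            </xsl:element>
          </xsl:when>
          <xsl:otherwise>
            <!-- hypothetical fallback: keep the raw name as an attribute -->
            <field name="{$name}">
              <xsl:value-of select="regex-group(2)"/>
            </field>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:copy>
</xsl:template>
```

With the sample input every name passes the check, so the output is the same as before; the fallback branch only matters for lines such as "Favourite fruit: apple".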
CodePudding user response:
You can use the tokenize() function in a template matching mail, after first normalizing the body text:
<xsl:template match="mail">
  <!-- strip the leading and trailing line break (and any indentation) -->
  <xsl:variable name="strvalue" select="replace(./body/text(), '(^\n\s*)|(\n\s*$)', '')"/>
  <!-- replace each remaining line break with # -->
  <xsl:variable name="strvalue" select="replace($strvalue, '\n\s*', '#')"/>
  <xsl:copy>
    <xsl:for-each select="tokenize($strvalue, '#')">
      <xsl:variable name="values" select="tokenize(., ': ')"/>
      <xsl:element name="{$values[1]}">
        <xsl:value-of select="$values[2]"/>
      </xsl:element>
    </xsl:for-each>
  </xsl:copy>
</xsl:template>
This part
<xsl:variable name="strvalue" select="replace(./body/text(), '(^\n\s*)|(\n\s*$)', '')"/>
<xsl:variable name="strvalue" select="replace($strvalue, '\n\s*', '#')"/>
transforms the body text into a single string in which the lines are separated by # and saves it in a variable. The string from body then looks like this:
Fruit: apple#Vagetable: potato#Animal: dog
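One caveat with the placeholder approach is that a value which itself contains # would be split in the wrong place. As a rough sketch of a variant that avoids the placeholder, the text can be trimmed once and then tokenized directly on the line breaks. This assumes that mail has exactly one body child and that every line has the form "name: value"; the variable name trimmed is my own, and the identity template is included only so the stylesheet is complete.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- identity transform: copies everything not matched below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="mail">
    <!-- remove whitespace at the very start and end of the body text -->
    <xsl:variable name="trimmed" select="replace(body, '^\s+|\s+$', '')"/>
    <xsl:copy>
      <!-- split on line breaks, swallowing any indentation around them -->
      <xsl:for-each select="tokenize($trimmed, '\s*\n\s*')">
        <xsl:element name="{substring-before(., ': ')}">
          <xsl:value-of select="substring-after(., ': ')"/>
        </xsl:element>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

As with the other answers, this still relies on the part before the colon being a valid element name.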