Removing both single and double quotes using XSLT translate function or better approach

Time:10-17

I am working with XSLT 1.0 and XML. I am new to both, but I have been reading about and experimenting with how XSLT is applied to XML. I have now been given a project in which I need to filter invalid characters out of an XML element. The Java Transformer class is used to apply the XSLT to the XML. The Java code is similar to the Oracle tutorial page, under the section "Writing an XSLT Transform". I added the XML and XSL below and ran the code as described in the tutorial. My goal is to remove single and double quotes and the following characters: #60;•^#x6;

The XML file

<?xml version="1.0" encoding="UTF-8"?>
<Author>
    <Name>
        <FirstName>Ch#60;•^#x6;'""ris</FirstName>
        <LastName>Banville</LastName>
    </Name>
</Author>

The XSL file

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <html>
            <body>
                 <xsl:variable name="invalid">#60;•^#x6;"'</xsl:variable>
                 <div>
                 <xsl:value-of select="translate(/Author/Name/FirstName,$invalid,'')" />
                 </div>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>

The end output is

  <body>
     <div>Chris</div>
   </body>

My question is: what is the right approach in XSLT 1.0 to escape both single and double quotes, together with the other invalid characters? I have also tried nesting the XSLT translate function as shown below, after creating a "single" and a "double" variable.

<xsl:value-of select="translate(translate(/Author/Name/FirstName,$single,''),$double,'')" />

I am still not sure about the pros and cons of either of my current implementations, or even whether this is the right approach for such a task.

CodePudding user response:

The XPath 1.0 specification states:

Within expressions, literal strings are delimited by single or double quotation marks, which are also used to delimit XML attributes. To avoid a quotation mark in an expression being interpreted by the XML processor as terminating the attribute value the quotation mark can be entered as a character reference (&quot; or &apos;). Alternatively, the expression can use single quotation marks if the XML attribute is delimited with double quotation marks or vice-versa.

This means that, using a single string literal for the characters to remove, you can strip either double quote characters with:

<xsl:value-of select="translate(input, '&quot;', '')"/>

or single quote (apostrophe) characters by:

<xsl:value-of select='translate(input, "&apos;", "")'/>

but not both.

Defining a variable containing all the unwanted characters as literal text (as you did) is probably the best way around this limitation.
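
As a rough sketch of that workaround, the stylesheet below strips both quote characters from the FirstName element of the sample XML in the question. The variable name "quotes" and the text output method are assumptions made for illustration. The second value-of shows a possible alternative that needs no variable: concat() joins two literals, each delimited by the other kind of quote.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- Hypothetical variable: both quote characters as plain element text,
         so no attribute delimiter can clash with them -->
    <xsl:variable name="quotes">"'</xsl:variable>

    <xsl:template match="/">
        <!-- Option 1: a single translate call with the variable removes both characters -->
        <xsl:value-of select="translate(/Author/Name/FirstName, $quotes, '')"/>
        <xsl:text>&#10;</xsl:text>
        <!-- Option 2: no variable; after XML parsing the second argument reads concat('"', "'") -->
        <xsl:value-of select="translate(/Author/Name/FirstName, concat('&quot;', &quot;'&quot;), '')"/>
    </xsl:template>
</xsl:stylesheet>
```

Both options remove only the two quote characters; any other unwanted characters would have to be added to the variable or to the concat() arguments. A nested translate with separate "single" and "double" variables should give the same result as one call with a combined variable, since translate works character by character; the single call is just shorter.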
