XSL transform failing when ampersand in XML string

Time:03-09

When I try to transform XML containing an & (ampersand), I get the error below. How do I escape it?

Error :

Unable to generate the XML document using the provided XML/XSL input. org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 61; The reference to entity "LOB" must end with the ';' delimiter. You most likely forgot to escape '&' into '&amp;'

Input XML:

<?xml version="1.0" encoding="UTF-8"?>
<store> <!-- Root Element -->
    <book id ="5350192956">
        <bookname>https://test.com/logon.jsp?fromLoc=ALL&LOB=COLLogon</bookname> 
        <authorname>Michael Kay</authorname>
        <publisher>Wrox</publisher>
        <price>$40</price> 
        <edition>4th</edition>         
    </book> 
    <book id ="3741122298">
        <bookname>Head First Java</bookname> 
        <authorname>Kathy Sierra</authorname>
        <publisher>O'reilly</publisher>
        <price>$19</price> 
        <edition>1st</edition>         
    </book>
    <book id ="9987436700">
        <bookname>SQL The Complete Reference</bookname> 
        <authorname>James R. Groff</authorname>
        <publisher>McGraw-Hill</publisher>
        <price>$45</price> 
        <edition>3rd</edition>         
    </book>
</store>

XSL :

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0"  >          
        <xsl:template match = "/">           
            <html>   
                <body>  
                    <h2>Books:-</h2>   
                    <table border = "1">   
                        <tr bgcolor = "#cd8932">   
                            <th>Book ID</th>   
                            <th>Book Name</th>   
                            <th>Author Name</th>   
                            <th>Publisher</th>   
                            <th>Price</th>                           
                            <th>Edition</th>
                        </tr>                        
                        <xsl:for-each select="store/book">   
                            <tr bgcolor = "#84cd32">   
                                <td><xsl:value-of select = "@id"/></td>   
                                <td><xsl:value-of disable-output-escaping="yes" select = "bookname" /></td>   
                                <td><xsl:value-of select = "authorname"/></td>   
                                <td><xsl:value-of select = "publisher"/></td>   
                                <td><xsl:value-of select = "price"/></td>                               
                                <td><xsl:value-of select = "edition"/></td>
                            </tr>   
                        </xsl:for-each>   
                    </table>   
                </body>   
            </html>   
        </xsl:template>   
</xsl:stylesheet>

CodePudding user response:

This XML element contains an unescaped & character:

<bookname>https://test.com/logon.jsp?fromLoc=ALL&LOB=COLLogon</bookname> 

It should be encoded like this:

<bookname>https://test.com/logon.jsp?fromLoc=ALL&amp;LOB=COLLogon</bookname> 
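If the XML is generated by a program, the escaping is best done where the document is produced. The following is a minimal sketch in Python (an assumption; the error message suggests the original environment is Java, where the same idea applies). It shows two options: escaping the text by hand with `xml.sax.saxutils.escape`, or letting an XML library serialize the element so that the escaping happens automatically. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

url = "https://test.com/logon.jsp?fromLoc=ALL&LOB=COLLogon"

# Option 1: escape the text by hand before placing it in the XML string.
escaped = escape(url)
manual_xml = f"<bookname>{escaped}</bookname>"

# Option 2: build the element with an XML library; the serializer
# escapes '&', '<' and '>' in text content for you.
elem = ET.Element("bookname")
elem.text = url
library_xml = ET.tostring(elem, encoding="unicode")

# Parsing the well-formed result gives back the original URL,
# with a plain '&' again.
parsed_text = ET.fromstring(manual_xml).text

print(manual_xml)
print(library_xml)
print(parsed_text)
```

Either way, the escaping exists only in the serialized document. The parser turns `&amp;` back into `&`, so the stylesheet sees the real URL.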

CodePudding user response:

The input is not well-formed XML, and there is no "almost-XML" standard, so XML tools cannot be expected to process it. Strictly speaking, the input file should be rejected.

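But if deadlines are tight, you could do worse than try the following, which repairs the file with the HTML parser (stripping the XML declaration in the process) and diffs the result against the original: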

xmllint --recover --html  --xmlout --dropdtd --xpath 'html/body/*' file.xml  2>/dev/null |
diff --ignore-all-space --context=1 file.xml -
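  • --recover: output any parsable portions of an invalid document
  • --html: use the HTML parser
  • --xmlout: serialize the result as XML rather than HTML
  • --dropdtd: remove the DOCTYPE from the output
  • --xpath '…': strips the HTML wrapper
  • 2>/dev/null: discards errors from the HTML parsing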
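If xmllint is not available, a similar stopgap can be sketched as a text pre-processing step that escapes only the ampersands that do not already begin an entity or character reference. This is a heuristic, not a parser. The pattern and the helper name `escape_bare_ampersands` are assumptions made for illustration, and the approach misfires on text such as `a&b;c` or on ampersands inside CDATA sections and comments.

```python
import re
import xml.etree.ElementTree as ET

# Match '&' that does NOT start a named (&amp;), decimal (&#38;)
# or hexadecimal (&#x26;) reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(almost_xml: str) -> str:
    """Replace bare '&' characters with '&amp;' (hypothetical helper)."""
    return BARE_AMP.sub("&amp;", almost_xml)


broken = "<bookname>https://test.com/logon.jsp?fromLoc=ALL&LOB=COLLogon</bookname>"
repaired = escape_bare_ampersands(broken)

# The repaired string is well-formed, so a normal XML parser accepts it.
print(repaired)
print(ET.fromstring(repaired).text)
```

References that are already escaped are left alone, so running the repair twice does not double-escape anything. Fixing the producer of the file remains the better long-term option.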