Parse XML with special characters in Groovy

Time:03-18

I want to read an XML document and then insert its values into a database. My problem arises when a value contains special characters.

def xmlResponse = """<?xml version="1.0" encoding="UTF-8"?>
    <nm>
        <item>
            <Row>
               <cod>1</cod>
               <desc>RPAS <Management></desc>
            </Row>
            <Row>
               <cod>110</cod>
               <desc>FIGHTER3 & SIMULATION</desc>
            </Row>
       </item>
   </nm>"""

My code is:

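  def parser = new XmlSlurper()
  // escape ampersands first so the entities added below are not escaped twice
  def xmlPars = xmlResponse.replaceAll("&", "&amp;")
  // escape the one known pseudo-tag; this does not generalize to other values
  def xmlPars2 = xmlPars.replaceAll("<Management>", "&lt;Management&gt;")
  def xml = parser.parseText(xmlPars2)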

This works only for the string 'Management' and not in general, because if I replace every '<' and '>' the parser returns an error.

Can you help me write code that always works?

I would prefer not to escape these characters (if possible), because my insert should contain the string as is.

CodePudding user response:

This is not valid XML, and what you are doing is trying to repair the result of incorrect XML generation. It would be better to fix the place where this XML is built.
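
If the code that produces the XML is under your control, one possible way to fix it at the source is to let a builder do the escaping. The following is only a sketch: it assumes the producer is also Groovy, and the `rows` list is hypothetical sample data standing in for the real source of the values.

```groovy
import groovy.xml.MarkupBuilder

// Hypothetical sample data; in practice this comes from wherever the XML is generated
def rows = [
    [cod: 1,   desc: 'RPAS <Management>'],
    [cod: 110, desc: 'FIGHTER3 & SIMULATION']
]

def writer = new StringWriter()
// MarkupBuilder escapes '<' and '&' in text content while writing
new MarkupBuilder(writer).nm {
    item {
        rows.each { r ->
            Row {
                cod(r.cod)
                desc(r.desc)
            }
        }
    }
}

def generated = writer.toString()
println generated
```

XML produced this way is well-formed, so XmlSlurper can parse it directly and `desc.text()` returns the original, unescaped string.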

However, there is an easy workaround that could work for you:

def xmlResponse = """<?xml version="1.0" encoding="UTF-8"?>
    <nm>
        <item>
            <Row>
               <cod>1</cod>
               <desc>RPAS <Management></desc>
            </Row>
            <Row>
               <cod>110</cod>
               <desc>FIGHTER3 & SIMULATION</desc>
            </Row>
       </item>
   </nm>"""
//let's convert each <desc>...</desc>
//  to      <desc><![CDATA[...]]></desc>
//  then value inside CDATA does not require xml escaping
xmlResponse = xmlResponse.replaceAll('<desc>','<desc><![CDATA[').replaceAll('</desc>',']]></desc>')

def xml = new XmlSlurper().parseText(xmlResponse)
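
After parsing, the values can be read back without any escaping. The sketch below is self-contained and makes a few assumptions: it uses a shortened copy of the document, the `rows` variable is just an illustrative name, and the import targets Groovy 3 or later (on Groovy 2.x, drop the import, since XmlSlurper lives in `groovy.util` and is imported by default).

```groovy
import groovy.xml.XmlSlurper   // Groovy 3+; not needed on Groovy 2.x

// Shortened copy of the malformed input from the question
def raw = '''<nm>
  <item>
    <Row><cod>1</cod><desc>RPAS <Management></desc></Row>
    <Row><cod>110</cod><desc>FIGHTER3 & SIMULATION</desc></Row>
  </item>
</nm>'''

// replace() does a literal (non-regex) substitution
def wrapped = raw.replace('<desc>', '<desc><![CDATA[')
                 .replace('</desc>', ']]></desc>')

def xml = new XmlSlurper().parseText(wrapped)

// Collect each Row into a plain map; text() returns the raw CDATA content
def rows = xml.item.Row.collect { row ->
    [cod: row.cod.text(), desc: row.desc.text()]
}

assert rows[0].desc == 'RPAS <Management>'
assert rows[1].desc == 'FIGHTER3 & SIMULATION'
```

Note the limits of this workaround: it only covers `<desc>` elements, and it breaks if a value itself contains `]]>` or a literal `</desc>`. The strings in `rows` can then be bound as parameters of the insert statement.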