Jackson XmlMapper - JsonParseException: Unexpected character '&'

Time:12-14

I am trying to parse an XML string into a Java object using com.fasterxml.jackson.dataformat.xml.XmlMapper.

The problem is that the XML string contains the character '&'.

The following exception is thrown:

Exception in thread "main" com.fasterxml.jackson.databind.JsonMappingException: Unexpected character '&' in prolog; expected '<'.

Code

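import java.util.Map;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;

public class MyProblem {
   // readValue declares checked exceptions, so main has to declare or handle them
   public static void main(String[] args) throws Exception {
      XmlMapper xmlMapper = new XmlMapper();
      // The raw '&' inside the element text is what triggers the exception
      String myXML = "<cookies>Chocolate&Butter cocunut</cookies>";
      Map<String, String> myTester = xmlMapper.reader().readValue(myXML, Map.class);
      System.out.println(myTester);
   }
}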

I expected System.out.println(myTester) to print the parsed map.

After reading XmlMapper's documentation, I believe there is a property I can set to override the deserialization behaviour.

If I need to escape these special characters instead, how do I do that?

CodePudding user response:

Because of the special role the ampersand character plays in XML, it must be

  • either enclosed in a CDATA section: "<cookies><![CDATA[Chocolate&Butter cocunut]]></cookies>"
  • or written as the entity reference &amp;: "<cookies>Chocolate&amp;Butter cocunut</cookies>"

Both would be valid XML strings that Jackson and the underlying Woodstox can parse.
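
If the text comes from somewhere you do not control, both options can be applied in code before the XML string is built. The following is only a minimal sketch: the class XmlEscapeSketch and its helpers escapeText, wrapInCdata and parseRootText are hypothetical names, not part of Jackson. To stay free of external dependencies it checks the result with the JDK's built-in DOM parser rather than XmlMapper; the resulting strings should be just as acceptable as input to xmlMapper.readValue.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XmlEscapeSketch {

    // Hypothetical helper: escapes characters that must not appear literally in element text.
    public static String escapeText(String raw) {
        return raw.replace("&", "&amp;")   // '&' first, so the entities added below are not escaped again
                  .replace("<", "&lt;")
                  .replace(">", "&gt;");
    }

    // Hypothetical helper: wraps the text in a CDATA section.
    // A literal "]]>" would close the section early, so it is split across two sections.
    public static String wrapInCdata(String raw) {
        return "<![CDATA[" + raw.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    // Parses the XML with the JDK's DOM parser and returns the root element's text.
    public static String parseRootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String raw = "Chocolate&Butter cocunut";

        String escaped = "<cookies>" + escapeText(raw) + "</cookies>";
        String cdata = "<cookies>" + wrapInCdata(raw) + "</cookies>";

        System.out.println(escaped);                // <cookies>Chocolate&amp;Butter cocunut</cookies>
        System.out.println(parseRootText(escaped)); // Chocolate&Butter cocunut
        System.out.println(parseRootText(cdata));   // Chocolate&Butter cocunut
    }
}
```

Escaping is usually the simpler choice for short values; CDATA can be more readable when the text contains many '&' or '<' characters.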

See also XML Spec, 2.4 Character Data and Markup:

The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings " &amp; " and " &lt; " respectively. The right angle bracket (>) may be represented using the string " &gt; ", and MUST, for compatibility, be escaped using either " &gt; " or a character reference when it appears in the string " ]]> " in content, when that string is not marking the end of a CDATA section.
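
The numeric character references mentioned in the spec can be tried out the same way. This is a small sketch, again using the JDK's DOM parser instead of Jackson so that it runs without extra dependencies; the class CharRefSketch and its helper rootText are hypothetical names. The code point of '&' is 38 in decimal and 26 in hexadecimal.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class CharRefSketch {

    // Parses the XML with the JDK's DOM parser and returns the root element's text.
    public static String rootText(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Decimal and hexadecimal character references for '&'
        String decimal = "<cookies>Chocolate&#38;Butter</cookies>";
        String hex = "<cookies>Chocolate&#x26;Butter</cookies>";

        System.out.println(rootText(decimal)); // Chocolate&Butter
        System.out.println(rootText(hex));     // Chocolate&Butter
    }
}
```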

CodePudding user response:

Jackson also has parser features that relax escaping rules during deserialization. JsonParser.Feature.ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER allows a backslash to escape any character, and JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS allows unquoted control characters inside strings. Note that both are features of the JSON parser, so they are unlikely to help with a raw '&' in XML input.
