JsonNode.Parse: error parsing text with accents

Time:12-22

I am trying to parse text containing Latin characters with the Parse method of JsonNode, from System.Text.Json.

But when the text contains accents, the method returns escaped characters.

var jsonString = File.ReadAllText(path, Encoding.GetEncoding(1252));                   
var jTemplate = JsonNode.Parse(jsonString);

The variable "jsonString" contains the right text (with accents), but after calling JsonNode.Parse the object "jTemplate" shows the text with escape sequences:

"Ciberseguridad en la organización" in jsonString

"Ciberseguridad en la organizaci\u00F3n" in jTemplate

I have also tried other encodings and code pages, for example UTF-8, with the same result.

Any idea how to parse text with accents?

Thanks in advance.

CodePudding user response:

For the moment, JsonNode.Parse() does not provide a way to set the Encoder the way JsonSerializer does.

You have two options:

  1. Use JsonSerializer instead and follow the tips from the link above.

  2. Unescape the string value after parsing it using the JsonNode:

    var expectedValue = Regex.Unescape(jTemplate["key"].ToString());
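
As a complement to the two options above, here is a minimal sketch of a third approach, assuming .NET 6 or later. As far as I can tell, the escape sequences appear when the node is written back out as JSON text, not in the parsed value itself, and JsonNode.ToJsonString accepts a JsonSerializerOptions where the Encoder can be set. The input string and the "data" property name are made up for illustration.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Json.Nodes;
using System.Text.Unicode;

// Hypothetical input; "data" is an illustrative property name.
string json = "{\"data\": \"Ciberseguridad en la organización\"}";
JsonNode? node = JsonNode.Parse(json);

// Reading the value gives back the plain .NET string, accents included.
string value = node!["data"]!.GetValue<string>();
Console.WriteLine(value);

// Writing the node as JSON with the default settings escapes non-ASCII characters.
string escaped = node.ToJsonString();
Console.WriteLine(escaped);

// An encoder that allows all Unicode ranges keeps the accents literal in the output.
var options = new JsonSerializerOptions
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All)
};
string readable = node.ToJsonString(options);
Console.WriteLine(readable);
```

If only the string value is needed, GetValue<string>() alone should be enough; the options only matter when the whole node is turned back into JSON text.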
    

CodePudding user response:

You can use the JsonSerializer.Deserialize method, which accepts a JsonSerializerOptions object where you can set the Encoder.

The output of my code sample is:

Ciberseguridad en la organización

using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;

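string jsonString = "{\"data\": \"Ciberseguridad en la organización\"}";

// Allow every Unicode range so accented characters are not written as \uXXXX.
// Note: the Encoder is applied when JSON is written; deserialized strings
// are already unescaped.
JsonSerializerOptions options = new JsonSerializerOptions()
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All)
};

// Deserialize returns null for the JSON literal "null", hence the nullable type.
DataDto? jTemplate = JsonSerializer.Deserialize<DataDto>(jsonString, options);
Console.WriteLine(jTemplate?.data);

// Simple DTO whose property name matches the JSON key.
class DataDto
{
    public string? data { get; set; }
}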
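
Since the Encoder setting matters mainly when JSON is written, here is a small sketch of the reverse direction, assuming the same options as in the sample above. It uses a Dictionary instead of DataDto only to keep the snippet short; the key and value are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;

// Illustrative data; a Dictionary stands in for the DTO.
var payload = new Dictionary<string, string>
{
    ["data"] = "Ciberseguridad en la organización"
};

// Default serialization escapes non-ASCII characters as \uXXXX sequences.
string defaultJson = JsonSerializer.Serialize(payload);
Console.WriteLine(defaultJson);

// With an encoder covering all Unicode ranges, accents are written literally.
var writeOptions = new JsonSerializerOptions
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All)
};
string accentedJson = JsonSerializer.Serialize(payload, writeOptions);
Console.WriteLine(accentedJson);
```

Both outputs are valid JSON describing the same string; the difference is only in how the text is rendered.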