I am trying to parse Latin text with the JsonNode.Parse method from System.Text.Json.
But when the text contains accented characters, the result contains escape sequences.
var jsonString = File.ReadAllText(path, Encoding.GetEncoding(1252));
var jTemplate = JsonNode.Parse(jsonString);
The string "jsonString" contains the correct text (with accents), but after I call JsonNode.Parse the object "jTemplate" contains the escaped text:
"Ciberseguridad en la organización" in jsonString
"Ciberseguridad en la organizaci\u00F3n" in jTemplate
I have also tried other encodings and code pages, for example UTF-8, with the same results.
Any idea how to parse text with accents?
Thanks in advance.
CodePudding user response:
At the moment, JsonNode.Parse() doesn't provide a way to set the Encoder the way JsonSerializer does.
You have two options:
1. Use JsonSerializer instead and follow the tips from the link above.
2. Unescape the string value after parsing it with JsonNode:
var expectedValue = Regex.Unescape(jTemplate["key"].ToString());
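For what it's worth, the escaping seems to appear when the node is written back to JSON text (for example via ToJsonString() on the parsed object), not in the parsed string value itself. Here is a minimal sketch of that, assuming a hypothetical "data" key and a .NET version that ships System.Text.Json.Nodes; the sample document is made up for illustration.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Json.Nodes;
using System.Text.Unicode;

// Hypothetical sample document; "data" is an illustrative key, not from the question.
string jsonString = "{\"data\": \"Ciberseguridad en la organización\"}";
JsonNode? jTemplate = JsonNode.Parse(jsonString);

// Reading the string value returns the parsed text, accents included.
string value = jTemplate!["data"]!.GetValue<string>();

// Writing the node back to JSON with the default encoder escapes non-ASCII characters.
string escaped = jTemplate.ToJsonString();

// Passing options with a more permissive encoder keeps the accents in the JSON text.
var options = new JsonSerializerOptions
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All)
};
string unescaped = jTemplate.ToJsonString(options);

Console.WriteLine(value);
Console.WriteLine(escaped);
Console.WriteLine(unescaped);
```

If you only need the string, reading it with GetValue<string>() may be enough, and the encoder only matters when you write the node out again.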
CodePudding user response:
You can use the JsonSerializer.Deserialize method, which accepts a JsonSerializerOptions object where you can set the Encoder.
The output of my code sample is:
Ciberseguridad en la organización
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;
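// Sample JSON containing an accented character.
string jsonString = "{\"data\": \"Ciberseguridad en la organización\"}";
// Allow all Unicode ranges so accented characters are not escaped when JSON is written.
JsonSerializerOptions options = new JsonSerializerOptions()
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All)
};
DataDto? jTemplate = JsonSerializer.Deserialize<DataDto>(jsonString, options);
// Deserialize can return null, so use a null-conditional access.
Console.WriteLine(jTemplate?.data);
class DataDto
{
    public string? data { get; set; }
}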
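As far as I can tell, the Encoder setting takes effect when JSON is written, so the difference is easiest to see with JsonSerializer.Serialize. This is a small sketch, using an anonymous object as a stand-in for DataDto (an assumption made to keep the example short).

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;

// Anonymous stand-in for DataDto, used only for illustration.
var dto = new { data = "Ciberseguridad en la organización" };

// Default options: non-ASCII characters are written as \uXXXX escapes.
string defaultJson = JsonSerializer.Serialize(dto);

// Options with an encoder that allows all Unicode ranges: accents are written as-is.
var options = new JsonSerializerOptions
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All)
};
string relaxedJson = JsonSerializer.Serialize(dto, options);

Console.WriteLine(defaultJson);
Console.WriteLine(relaxedJson);
```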