I ran into an encoding problem when writing Russian text to a file in C#. The situation: I have a string containing both Russian and English characters, encoded as UTF-8, and I write it to a file in two ways:
using (StreamWriter sw = new StreamWriter(path, false, Encoding.UTF8))
{
    await sw.WriteLineAsync(stringContent);
}
In this case everything in the file is fine: both the Russian and the English characters are there, and Notepad detects the encoding as UTF-8 with BOM.
using (StreamWriter sw = File.CreateText(path))
{
    await sw.WriteLineAsync(stringContent);
}
In this case the file contains the English characters but garbage (mojibake) instead of the Russian ones, and Notepad detects the encoding as UTF-8 without BOM.
File.CreateText() returns a StreamWriter with Encoding.UTF8.
Question: why does everything work when I explicitly specify UTF-8 encoding, but when UTF-8 is used by default in File.CreateText(), the Russian characters turn into a mess? Is the BOM the problem?
CodePudding user response:
Does this question help? "Why StreamWriter writes text to file with using UTF-8 without BOM?"
Writing without a BOM appears to be the default behavior of StreamWriter.
StreamWriter defaults to using an instance of UTF8Encoding unless specified otherwise. This instance of UTF8Encoding is constructed without a byte order mark (BOM), so its GetPreamble method returns an empty byte array.
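To see the difference concretely, here is a minimal sketch that writes the same string both ways and compares the raw bytes. The class name BomDemo, the helper method names, and the sample string are hypothetical and chosen only for illustration; the calls themselves (Encoding.UTF8, UTF8Encoding, File.CreateText, GetPreamble) are standard .NET APIs.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class BomDemo
{
    // Writes text with an explicit Encoding.UTF8 (which emits a BOM) and returns the raw bytes.
    public static byte[] WriteWithExplicitUtf8(string path, string text)
    {
        using (var sw = new StreamWriter(path, false, Encoding.UTF8))
        {
            sw.Write(text);
        }
        return File.ReadAllBytes(path);
    }

    // Writes text via File.CreateText (UTF-8 without a BOM) and returns the raw bytes.
    public static byte[] WriteWithCreateText(string path, string text)
    {
        using (StreamWriter sw = File.CreateText(path))
        {
            sw.Write(text);
        }
        return File.ReadAllBytes(path);
    }

    public static void Main()
    {
        string text = "Привет, world";          // hypothetical sample content
        string path = Path.GetTempFileName();   // scratch file, overwritten on each write

        byte[] withBom = WriteWithExplicitUtf8(path, text);
        byte[] noBom = WriteWithCreateText(path, text);

        // Encoding.UTF8 has a 3-byte preamble (EF BB BF); UTF8Encoding(false) has none.
        Console.WriteLine(Encoding.UTF8.GetPreamble().Length);           // 3
        Console.WriteLine(new UTF8Encoding(false).GetPreamble().Length); // 0

        // The two files differ only by the BOM; the encoded text is byte-for-byte identical.
        Console.WriteLine(withBom.Length - noBom.Length);                // 3
        Console.WriteLine(withBom.Skip(3).SequenceEqual(noBom));         // True

        File.Delete(path);
    }
}
```

Since the encoded text is the same in both files, the garbled Russian characters most likely come from the program that reads the file: without a BOM it has to guess the encoding and may fall back to a legacy code page. If the reader depends on the BOM, pass Encoding.UTF8 (or new UTF8Encoding(true)) to the StreamWriter explicitly instead of relying on File.CreateText().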