I read a file using
File.ReadAllText(..., Encoding.ASCII);
According to the documentation [MSDN] (emphasis mine),
This method attempts to automatically detect the encoding of a file based on the presence of byte order marks. Encoding formats UTF-8 and UTF-32 (both big-endian and little-endian) can be detected.
However, in my case the ASCII file incorrectly started with 0xFE 0xFF,
and File.ReadAllText() detected UTF-16 (probably big-endian, but I did not check).
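A minimal sketch that should reproduce this, assuming a temporary file that contains only the bytes 0xFE 0xFF 0x00 0x41. The class name BomRepro and the method name are made up for illustration; only File.WriteAllBytes, File.ReadAllText, Path.GetTempFileName and File.Delete are real framework calls.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomRepro
{
    // Hypothetical helper: writes 0xFE 0xFF 0x00 0x41 to a temp file and
    // reads it back while explicitly asking for ASCII.
    public static string ReadWithAsciiRequested()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[] { 0xFE, 0xFF, 0x00, 0x41 });

            // Encoding.ASCII is requested here, but a leading byte order
            // mark can still override it.
            return File.ReadAllText(path, Encoding.ASCII);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If the BOM is honoured, the returned string is the single character "A" (0x00 0x41 decoded as UTF-16 big-endian). A pure ASCII decode of the same four bytes would instead give four characters.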
CodePudding user response:
According to the source of File [referencesource], it uses a StreamReader:
private static String InternalReadAllText(String path, Encoding encoding, bool checkHost)
{
    ...
    using (StreamReader sr = new StreamReader(path, encoding, true, StreamReader.DefaultBufferSize, checkHost))
        return sr.ReadToEnd();
}
and that StreamReader overload with 5 parameters [MSDN] is documented to support UTF-16 as well:
It automatically recognizes UTF-8, little-endian Unicode, big-endian Unicode, little-endian UTF-32, and big-endian UTF-32 text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.
(emphasis mine)
Since File.ReadAllText() is supposed to, and is documented to, detect Unicode BOMs, it is probably a good thing that it detects UTF-16 as well. However, the documentation is wrong and should be updated. I filed issue #7515.
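If the goal is to force the requested encoding no matter what the file starts with, one possible workaround is to construct the StreamReader yourself and pass false for detectEncodingFromByteOrderMarks. This is only a sketch: the class StrictReader and its method are hypothetical names, and the StreamReader(String, Encoding, Boolean) constructor is the only framework API it relies on.

```csharp
using System.IO;
using System.Text;

public static class StrictReader
{
    // Hypothetical helper: reads the whole file with the given encoding and
    // does not switch encodings because of a leading byte order mark.
    public static string ReadAllTextNoBomDetection(string path, Encoding encoding)
    {
        using (var sr = new StreamReader(path, encoding, detectEncodingFromByteOrderMarks: false))
        {
            return sr.ReadToEnd();
        }
    }
}
```

With Encoding.ASCII, the stray 0xFE 0xFF bytes are then decoded as ordinary (invalid) ASCII input instead of selecting UTF-16, so they typically show up as replacement characters at the start of the string and may need to be trimmed by the caller.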