I have a problem converting a byte array to a string in the right format. I'm reading a byte array over a TCP socket, and one of the bytes it gives me is 158. If I read the string with:
Encoding.Latin1.GetString(data)
it gives me a string like "blahblah\u009eblahblah". Byte 158 (0x9E) is supposed to be the letter ž. The string I need is "blahblahžblahblah". How can I get the string in the right format?
I already tried other encodings like ASCII, UTF8, etc.; none of them gave me the right format.
EDIT: some example code showing how I'm getting the data and what I'm doing with it:
TcpClient client = new TcpClient(terminal.server_IP, terminal.port);
NetworkStream stream = client.GetStream();
stream.ReadTimeout = 2000;
string message = "some message for terminal";
byte[] msg = Encoding.Latin1.GetBytes(message);
stream.Write(msg, 0, msg.Length);
int bytes = stream.Read(data, 0, data.Length);
string rsp = Encoding.Latin1.GetString(data, 0, bytes);
EDIT2: So, I don't know what the problem was... I just created a new project for .NET Framework version 4.7.2, and in that project it's working fine. Thanks to everyone for the suggestions; credit goes to @Jeppe Stig Nielsen.
CodePudding user response:
Encoding.Latin1 is not usable in your case. True Latin-1 (ISO-8859-1) does not contain ž (LATIN SMALL LETTER Z WITH CARON).
If you want Windows-1252, use
Encoding.GetEncoding("Windows-1252").GetString(data)
This will turn bytes of decimal value 158 (hex 0x9E) into lowercase ž.
It may also be "Windows-1250" that you have. What other non-English letters do you expect in your text? Compare Windows-1252 and Windows-1250; they differ in general, but both agree that byte 0x9E (decimal 158) is ž.
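To make the comparison concrete, here is a minimal sketch that decodes the same two bytes with ISO-8859-1, Windows-1252 and Windows-1250 and prints the resulting code points in hex. The class name CodePageComparison, the helper CodePoints and the second sample byte 0xE8 are illustrative choices, not part of the question; the sketch assumes a runtime where CodePagesEncodingProvider is available.

```csharp
using System;
using System.Linq;
using System.Text;

class CodePageComparison
{
    // Assumption: running on .NET Core / .NET 5+, where the Windows code pages
    // must be registered before Encoding.GetEncoding can find them.
    static CodePageComparison()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: decodes the bytes with the named encoding and
    // returns the UTF-16 code units as space-separated 4-digit hex values.
    public static string CodePoints(string encodingName, byte[] bytes)
    {
        string text = Encoding.GetEncoding(encodingName).GetString(bytes);
        return string.Join(" ", text.Select(c => ((int)c).ToString("X4")));
    }

    static void Main()
    {
        // 0x9E is the byte from the question; 0xE8 is one where the two
        // Windows code pages disagree.
        byte[] sample = { 0x9E, 0xE8 };

        Console.WriteLine(CodePoints("ISO-8859-1", sample));   // 009E 00E8
        Console.WriteLine(CodePoints("Windows-1252", sample)); // 017E 00E8
        Console.WriteLine(CodePoints("Windows-1250", sample)); // 017E 010D
    }
}
```

U+017E is ž, so both Windows code pages decode byte 0x9E as expected, while ISO-8859-1 yields the control character U+009E seen in the question. Byte 0xE8 tells the two Windows code pages apart: it is è (U+00E8) in Windows-1252 and č (U+010D) in Windows-1250.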
On a .NET Core system, where the above does not work out of the box, register the code pages provider first:
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
var goodText = Encoding.GetEncoding("Windows-1252").GetString(data);
Using the type CodePagesEncodingProvider may require a reference to the assembly System.Text.Encoding.CodePages.dll (available as the NuGet package System.Text.Encoding.CodePages).