In C#, I'm looking for the index of a string inside a string - specifically the index where the newline (\n) character lives.
Given a string with Windows line breaks (\r\n):
If I look for "\n", it gives me -1. If I look for "\r\n", I get a result. If I look for '\n' as a character, I get a result.
Given a string with Unix line breaks (\n), I get a result.
string s = "hello\r\nworld";
Console.WriteLine(@"\r\n index: " + s.IndexOf("\r\n")); // 5
Console.WriteLine(@"\n index as string: " + s.IndexOf("\n")); // -1
Console.WriteLine(@"\n index as char: " + s.IndexOf('\n')); // 6
s = "hello\nworld";
Console.WriteLine(@"\n index as string: " + s.IndexOf("\n")); // 5
Console.WriteLine(@"\n index as char: " + s.IndexOf('\n')); // 5
I understand that line breaks are two characters, and if I was using StreamReader or File.ReadAllLines or something like that, then it would be handled automatically and I'd lose them.
I thought \n was a valid string by itself, and that \r\n, while special, still represented two separate and distinct characters in a string. But this is telling me otherwise.
I can do IndexOf on the character instead of the string ('\n' instead of "\n"), but I'd really like to know why this is happening so I can plan for it.
EDIT
FYI: Just found that converting the string to a Span gives the correct result. Not sure of the overhead involved in that, so I don't know how this compares with the Ordinal solution - I'm guessing Ordinal is the better one:
Console.WriteLine(@"\n index as string Ordinal: "
    + s.IndexOf("\n", StringComparison.Ordinal)); // 6
Console.WriteLine(@"\n index as Span: "
    + s.AsSpan().IndexOf("\n".AsSpan())); // 6
Console.WriteLine(@"\n index as string with s.AsSpan(): "
    + s.AsSpan().IndexOf("\n")); // 6
CodePudding user response:
There was a change in .NET 5.0 with the globalization libraries for Windows. In previous versions, NLS was used on Windows and ICU on Unix. .NET 5 uses ICU on both to make cross-platform development consistent, at the cost of surprising Windows developers (sigh). IndexOf(string) does a culture-sensitive search by default, so it is affected by this change, while IndexOf(char) is always ordinal, which is why the char lookup still works. Due to this change, you must pass StringComparison.Ordinal to find a newline in a string.
Note that this can also depend on the version of Windows (double sigh): the Windows 10 May 2019 Update and later include the ICU library, while on earlier versions that don't include it, .NET 5 falls back to NLS.
See this article from Microsoft. This article has more details on the APIs affected.
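As a minimal sketch of the ordinal workaround described above (assuming .NET 5 or later with top-level statements; the helper name IndexOfNewline is hypothetical and only for illustration), the search below does not depend on whether ICU or NLS is in use:

```csharp
using System;

string windows = "hello\r\nworld";
string unix = "hello\nworld";

Console.WriteLine(IndexOfNewline(windows)); // 6
Console.WriteLine(IndexOfNewline(unix));    // 5

// Hypothetical helper: an ordinal search compares raw UTF-16 code units,
// so "\n" is found even when it directly follows "\r".
static int IndexOfNewline(string text) =>
    text.IndexOf("\n", StringComparison.Ordinal);
```

Passing the comparison explicitly on every string-based IndexOf call also documents the intent, which helps if the code later runs on a different platform or runtime version.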
CodePudding user response:
You may use System.Environment.NewLine in your code, which is a property whose value depends on the operating system. Check here.
On Windows: "\r\n".
On Unix platforms: "\n".
using System;
string s = "hello" + Environment.NewLine + "world";
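A rough sketch of how that string could then be searched, assuming .NET 5 or later with top-level statements. Note that Environment.NewLine only matches text built with the current platform's line break, so the second lookup (an illustrative alternative, not part of the answer above) uses IndexOfAny to find whichever line-break character comes first:

```csharp
using System;

string s = "hello" + Environment.NewLine + "world";

// Ordinal search for the platform's own line break ("\r\n" or "\n").
int platformIndex = s.IndexOf(Environment.NewLine, StringComparison.Ordinal);
Console.WriteLine(platformIndex); // 5

// Platform-independent alternative: first '\r' or '\n', whichever comes first.
// Char-based searches are always ordinal.
int anyIndex = "hello\r\nworld".IndexOfAny(new[] { '\r', '\n' });
Console.WriteLine(anyIndex); // 5
```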