Unable to search for newline character in string

Time:10-07

In C#, I'm looking for the index of a string inside a string - specifically the index where the newline (\n) character lives.

Given a string with Windows line breaks (\r\n):

If I look for "\n", it gives me -1. If I look for "\r\n", I get a result. If I look for '\n' as a character, I get a result.

Given a string with Unix line breaks (\n), I get a result.

string s = "hello\r\nworld";

Console.WriteLine(@"\r\n index: " + s.IndexOf("\r\n")); // 5
Console.WriteLine(@"\n index as string: " + s.IndexOf("\n")); // -1
Console.WriteLine(@"\n index as char: " + s.IndexOf('\n')); // 6


s = "hello\nworld";

Console.WriteLine(@"\n index as string: " + s.IndexOf("\n")); // 5
Console.WriteLine(@"\n index as char: " + s.IndexOf('\n')); // 5

I understand that line breaks are two characters, and if I was using StreamReader or File.ReadAllLines or something like that, then it would be handled automatically and I'd lose them.

I thought \n was a valid string by itself, and that \r\n, while special, still represented two separate and distinct characters in a string. But this is telling me otherwise.

I can do IndexOf on the character instead of the string ('\n' instead of "\n"), but I'd really like to know why this is happening so I can plan for it.

EDIT

FYI: Just found that converting the string to a Span gives the correct result. Not sure the overhead involved in that, so I don't know how this compares with the Ordinal solution - I'm guessing the Ordinal is the better one:

Console.WriteLine(@"\n index as string Ordinal: "
      + s.IndexOf("\n", StringComparison.Ordinal)); // 6

Console.WriteLine(@"\n index as Span: "
      + s.AsSpan().IndexOf("\n".AsSpan())); // 6

Console.WriteLine(@"\n index as string with s.AsSpan(): "
      + s.AsSpan().IndexOf("\n")); // 6

Answer:

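There was a change in .NET 5.0 to the globalization libraries on Windows. Previous versions used NLS on Windows and ICU on Unix. .NET 5 uses ICU on both to make cross-platform behavior consistent, at the cost of surprising Windows developers (sigh). IndexOf with a string argument is a culture-sensitive search by default, and under ICU it does not match the \n inside a \r\n pair. The char overload and the Span version compare ordinally, which is why they still return 6. To find a newline with a string argument, pass StringComparison.Ordinal.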
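As a minimal sketch of the ordinal search (assuming .NET 5 or later with C# 9 top-level statements; the variable names are only for illustration), the following compares the three calls side by side. The culture-sensitive result is deliberately left without an expected value, since it depends on whether the runtime is using ICU or NLS:

```csharp
using System;

string s = "hello\r\nworld";

// Ordinal comparison matches raw UTF-16 code units, so "\n" is found
// even though it directly follows "\r".
int ordinal = s.IndexOf("\n", StringComparison.Ordinal); // 6

// The char overload always performs an ordinal search.
int asChar = s.IndexOf('\n'); // 6

// The string overload without a StringComparison is culture-sensitive;
// its result depends on the globalization library in use (ICU or NLS).
int cultureSensitive = s.IndexOf("\n");

Console.WriteLine($"ordinal: {ordinal}, char: {asChar}, culture-sensitive: {cultureSensitive}");
```

Passing StringComparison.Ordinal explicitly on every string search for control characters keeps the result the same across runtimes and Windows versions.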

Note that this can also depend on the version of Windows (double sigh): the Windows 10 May 2019 Update (version 1903) and later include the ICU library, while on earlier versions, which don't, .NET 5 falls back to NLS.

See this article from Microsoft. This article has more details on the APIs affected.

Answer:

You can use System.Environment.NewLine in your code. It is a property that returns the newline string for the current operating system. Check here.

On Windows: "\r\n".
On Unix platforms: "\n".

using System;
string s = "hello" + Environment.NewLine + "world";
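Keep in mind that this only helps when the string was built with Environment.NewLine on the same platform, and the search itself should still be ordinal. A minimal sketch, assuming .NET 5 or later with top-level statements (the variable name index is only for illustration):

```csharp
using System;

string s = "hello" + Environment.NewLine + "world";

// Ordinal search for the platform's newline string, so the result does not
// depend on the globalization library in use.
int index = s.IndexOf(Environment.NewLine, StringComparison.Ordinal);

Console.WriteLine(index); // 5, since "hello" is five characters long
```

For text that comes from a file or another machine, the line endings may not match Environment.NewLine, so searching for '\n' as a char or with StringComparison.Ordinal is the safer choice.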