I am looking for a regex that returns the part of a string before the first occurrence of \n, ignoring any escape sequences at the beginning of the string. I am using the regex "([^\n])+", which works fine if the strings are as below:
- "Excel Sheet for A and A Plants BCC-EFS-000-F-US \nPONI-DSUIK-4748-F-USKJH" returns "Excel Sheet for A and A Plants BCC-EFS-000-F-US"
- "\n\nMutiple Sheet for Data for materials\n and Services" returns "Mutiple Sheet for Data for materials"
But it starts failing if there are other escape characters at the beginning, like \t or \r. For example, "\t\n\tTransportation and Railways\n\t\n\n\t\n\tDocument." returns "t", while the expected result is "Transportation and Railways".
Is it possible to achieve this with a single regex?
CodePudding user response:
Capture groups let you include important context surrounding the content you want to extract. Here we use a named capture group, (?<content>). The character class before it, [\t\r\n]*, absorbs all \t, \r, and \n characters preceding the content.
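// Requires: using System.Text.RegularExpressions;
// Skip leading \t, \r, \n, then capture everything up to the next \n.
var regex = new Regex(@"[\t\r\n]*(?<content>[^\n]+)");
var match = regex.Match(str);
if (match.Success) {
    var content = match.Groups["content"].Value;
    ...
}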
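To check the pattern against the three strings from the question, here is a minimal, self-contained sketch. The class name FirstLineExtractor and the method Extract are hypothetical names chosen for illustration. The TrimEnd() call is an addition that is not part of the regex itself: it assumes you want to drop a trailing space or \r left at the end of the captured line, as in the first example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstLineExtractor
{
    // Same pattern as in the answer: skip leading \t, \r, \n,
    // then capture everything up to (not including) the next \n.
    private static readonly Regex Pattern =
        new Regex(@"[\t\r\n]*(?<content>[^\n]+)");

    // Hypothetical helper: returns the first non-empty line,
    // or an empty string when nothing matches.
    public static string Extract(string input)
    {
        var match = Pattern.Match(input);
        if (!match.Success)
        {
            return string.Empty;
        }
        // Assumption: trailing whitespace (a space or \r) is unwanted.
        return match.Groups["content"].Value.TrimEnd();
    }

    public static void Main()
    {
        // Prints: Excel Sheet for A and A Plants BCC-EFS-000-F-US
        Console.WriteLine(Extract("Excel Sheet for A and A Plants BCC-EFS-000-F-US \nPONI-DSUIK-4748-F-USKJH"));
        // Prints: Mutiple Sheet for Data for materials
        Console.WriteLine(Extract("\n\nMutiple Sheet for Data for materials\n and Services"));
        // Prints: Transportation and Railways
        Console.WriteLine(Extract("\t\n\tTransportation and Railways\n\t\n\n\t\n\tDocument."));
    }
}
```

Note that the leading character class only skips \t, \r, and \n. If the input can also start with spaces, adding a space to that class (or using \s*) would be one way to handle it, at the cost of also stripping leading indentation.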