Extract tokens from the string C# regex

Time:03-16

I have a string that needs to be split into tokens separated by the pipe character |.

Numeric tokens can appear bare, like 20 and 50 in the example below, or be wrapped in [] or {}.

String tokens are always wrapped in either [] or {} and can contain any special characters, including the | separator. They cannot contain [, ], { or } themselves.

[Name1]|20|[Nam|2]|{Na;me,3}|50|[Na|me!@#$%^&*()Finish]|[25]|{67}

I need to extract the following tokens from the string above:

Name1

20

Nam|2

Na;me,3

50

Na|me!@#$%^&*()Finish

25

67

How can this be done in C#? Are regular expressions the best way to go about it?

CodePudding user response:

You can extract them with

\[(?<r>[^][]*)]|\{(?<r>[^{}]*)}|(?<r>[^|]+)

See the regex demo. Details:

  • \[(?<r>[^][]*)] - [, then any zero or more chars other than [ and ] captured into Group "r", and then a ] char
  • | - or
  • \{(?<r>[^{}]*)} - {, then any zero or more chars other than { and } captured into Group "r", and then a } char
  • | - or
  • (?<r>[^|]+) - any one or more chars other than a | char captured in Group "r".
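The three alternatives can also be written out one per line with inline comments. This is only a sketch: the TokenRegex class and its Extract method are hypothetical names, not part of the answer, and the pattern is the same one as above, merely re-spaced using RegexOptions.IgnorePatternWhitespace.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenRegex
{
    // Hypothetical wrapper around the pattern described above.
    // IgnorePatternWhitespace drops unescaped whitespace outside character
    // classes and treats everything from # to end of line as a comment.
    private static readonly Regex Pattern = new Regex(@"
          \[ (?<r> [^][]* ) ]   # square-bracket token, content captured in r
        | \{ (?<r> [^{}]* ) }   # curly-brace token, content captured in r
        |    (?<r> [^|]+ )      # bare token, one or more non-pipe chars
        ", RegexOptions.IgnorePatternWhitespace);

    // Returns the value of group r for every match, in input order.
    public static List<string> Extract(string text) =>
        Pattern.Matches(text)
               .Cast<Match>()
               .Select(m => m.Groups["r"].Value)
               .ToList();
}
```

Reusing the group name "r" in every alternative is what lets the caller read a single group regardless of which branch matched.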

See the C# demo:

using System;
using System.Linq;
using System.Text.RegularExpressions;

var text = "[Name1]|20|[Nam|2]|{Na;me,3}|50|[Na|me!@#$%^&*()Finish]|[25]|{67}";
var pattern = @"\[(?<r>[^][]*)]|\{(?<r>[^{}]*)}|(?<r>[^|]+)";
// Every alternative captures into the same named group "r"
var result = Regex.Matches(text, pattern).Cast<Match>().Select(x => x.Groups["r"].Value);
foreach (var s in result)
    Console.WriteLine(s);

Output:

Name1
20
Nam|2
Na;me,3
50
Na|me!@#$%^&*()Finish
25
67
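
As for whether a regex is the best way: for comparison, here is a minimal sketch of the same tokenization done with a hand-written scanner. TokenScanner and Split are hypothetical names, and the sketch assumes the rules stated in the question (no nested brackets or braces). Unlike the regex, it throws on an unclosed [ or { instead of falling back to the bare-token branch.

```csharp
using System;
using System.Collections.Generic;

public static class TokenScanner
{
    // Hypothetical helper, not part of the original answer.
    public static List<string> Split(string text)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i];
            if (c == '|')
            {
                i++; // separator between tokens, nothing to emit
                continue;
            }

            int end;
            if (c == '[' || c == '{')
            {
                // Wrapped token: read up to the matching closing character.
                char close = c == '[' ? ']' : '}';
                end = text.IndexOf(close, i + 1);
                if (end < 0)
                    throw new FormatException($"Unclosed '{c}' at index {i}.");
                tokens.Add(text.Substring(i + 1, end - i - 1));
                i = end + 1;
            }
            else
            {
                // Bare token: read up to the next pipe or the end of the input.
                end = text.IndexOf('|', i);
                if (end < 0) end = text.Length;
                tokens.Add(text.Substring(i, end - i));
                i = end;
            }
        }
        return tokens;
    }
}
```

Calling TokenScanner.Split(text) on the sample string yields the same eight tokens as the regex. The regex is shorter; the scanner makes it easier to report malformed input.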