Get a substring from a substring of a string in C# using regex

Time:12-07

I want to extract a substring that follows a specific substring and ends with a specific character from a long string.

I have a long string: "<can be whatever characters or string>CSVSegment.ab12\<can be whatever characters or string>CSVSegment.cd34\<can be whatever>"

I want to extract just ab12 and cd34, that is, whatever comes after CSVSegment. and ends before \

Currently I am doing

Regex pattern = new Regex(@"CSVSegment.(?<SegmentName>))\");
Match match = pattern.Match(longstring);

I don't know how to use groups for this, or how I can get a list of all the strings that follow CSVSegment. and end before \ for every occurrence in the long string.

CodePudding user response:

As far as I can see, you want to match

ab12
cd34

from

....CSVSegment.ab12\....
....CSVSegment.cd34\....

You can do it with the pattern CSVSegment\.(?<SegmentName>[^\\]*)\\:

Regex pattern = new Regex(@"CSVSegment\.(?<SegmentName>[^\\]*)\\");

var result = pattern
  .Match(source)
  .Groups["SegmentName"]
  .Value;

Or, if you want all matches:

var results = pattern
  .Matches(source)
  .Cast<Match>()
  .Select(m => m.Groups["SegmentName"].Value)
  .ToArray(); 
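For reference, the two snippets above can be combined into a complete program along the following lines. This is only a sketch: the sample string and the names source and segmentNames are illustrative assumptions, and it relies on top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sample input modelled on the string in the question.
string source = @"xxxCSVSegment.ab12\yyyCSVSegment.cd34\zzz";

// Literal "CSVSegment.", then everything up to (not including) the next backslash.
var pattern = new Regex(@"CSVSegment\.(?<SegmentName>[^\\]*)\\");

// Matches returns every non-overlapping match; read the named group from each one.
string[] segmentNames = pattern
    .Matches(source)
    .Cast<Match>()
    .Select(m => m.Groups["SegmentName"].Value)
    .ToArray();

Console.WriteLine(string.Join(", ", segmentNames)); // prints "ab12, cd34"
```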

Pattern CSVSegment\.(?<SegmentName>[^\\]*)\\ explained:

CSVSegment\.           - prefix; note that the dot ('.') is escaped
(?<SegmentName>[^\\]*) - named group: zero or more characters other than \
\\                     - suffix \; note the escaping

Escaping is needed here because an unescaped . (dot) matches any character and \ is the regex escape character.
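To see why the dot has to be escaped, here is a small sketch that runs both variants against an input where the separator is not a dot. The input string and the variable names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: the character after "CSVSegment" is 'X', not a dot.
string input = @"CSVSegmentXab12\";

// Unescaped dot: matches any single character, so 'X' is accepted.
bool unescaped = Regex.IsMatch(input, @"CSVSegment.(?<SegmentName>[^\\]*)\\");

// Escaped dot: only a literal '.' is accepted, so this input is rejected.
bool escaped = Regex.IsMatch(input, @"CSVSegment\.(?<SegmentName>[^\\]*)\\");

Console.WriteLine($"unescaped: {unescaped}, escaped: {escaped}");
// prints "unescaped: True, escaped: False"
```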

CodePudding user response:

You need something like:

Regex pattern = new Regex(@"CSVSegment\.(?<sn>[a-z0-9]{4})\\");
MatchCollection matches = pattern.Matches(longstring);

foreach (Match m in matches)
  Console.WriteLine(m.Groups["sn"].Value);

The only part where you really went off track was not putting a pattern inside the named capturing group. Whatever the pattern inside the group matches ends up in the group's Value. I used [a-z0-9]{4}, which matches the ab12 and cd34 examples; if your data is more varied, you may want to alter this pattern to match the values you expect to see.
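As a rough illustration of that trade-off, the sketch below compares a strict four-character group with a more permissive one. The input string, the second segment name Header_1 and the variable names are assumptions made for the example, not data from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical input: one 4-character name and one longer name with an underscore.
string input = @"CSVSegment.ab12\CSVSegment.Header_1\";

// Strict: exactly four lowercase letters or digits.
var strict = new Regex(@"CSVSegment\.(?<sn>[a-z0-9]{4})\\");

// Permissive: one or more characters other than a backslash.
var permissive = new Regex(@"CSVSegment\.(?<sn>[^\\]+)\\");

string[] strictNames = strict.Matches(input)
    .Cast<Match>().Select(m => m.Groups["sn"].Value).ToArray();
string[] permissiveNames = permissive.Matches(input)
    .Cast<Match>().Select(m => m.Groups["sn"].Value).ToArray();

Console.WriteLine(string.Join(", ", strictNames));     // prints "ab12"
Console.WriteLine(string.Join(", ", permissiveNames)); // prints "ab12, Header_1"
```

The strict form rejects unexpected names, which can be useful for validation; the permissive form accepts anything up to the next backslash.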
