Split Regex matches into Groups

Time:02-12

I have markdown content with multiple images where each image is:

![Image Description](/image-path.png)

I am selecting all images in the markdown content using Regex:

var matches = new Regex(@"!\[.*?\]\((.*?)\)").Matches(content);

I am getting 2 groups:

Groups[0] = ![Image Description](/image-path.png);  > (Everything)

Groups[1] = /image-path.png                         > (Image Path)

Wouldn't it be possible to get the following instead?

Groups[0] = Image Description.                      > (Image Description)

Groups[1] = /image-path.png                         > (Image Path)

CodePudding user response:

Currently the group 1 value is part of the matched string.

You could make Image Description the match itself and capture only /image-path.png in group 1, using a lookbehind and a lookahead that contains a capture group:

(?<=!\[)[^][]*(?=]\(([^()]*)\))

The pattern in parts matches:

  • (?<=!\[) Assert ![ directly to the left
  • [^][]* Match zero or more chars other than [ and ]
  • (?= Positive lookahead to assert to the right
    • ]\(([^()]*)\) Match ] and capture in group 1 what is between (...)
  • ) Close lookahead
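
As a rough illustration of how the lookaround pattern could be used from C#, here is a minimal sketch. The class name LookaroundDemo and the helper Parse are made up for this example and are not part of the answer. It assumes the default .NET regex options, where a ] placed first in a character class is taken literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Pattern from the answer: the overall match is the description,
    // and group 1 (captured inside the lookahead) is the image path.
    public const string Pattern = @"(?<=!\[)[^][]*(?=]\(([^()]*)\))";

    // Hypothetical helper: returns Groups[0] and Groups[1] of the first match.
    public static (string Description, string Path) Parse(string content)
    {
        Match m = Regex.Match(content, Pattern);
        return (m.Groups[0].Value, m.Groups[1].Value);
    }

    public static void Main()
    {
        var (description, path) = Parse("![Image Description](/image-path.png)");
        Console.WriteLine(description); // Groups[0]: the description
        Console.WriteLine(path);        // Groups[1]: the image path
    }
}
```

With the sample input from the question, Groups[0] should hold Image Description and Groups[1] should hold /image-path.png, which is the layout the question asks for.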


CodePudding user response:

You can capture the relevant sections of your content text by using a capture group.

Compare your regex and mine, where I made a very small change by adding parentheses to capture the Image Description part of your content:

!\[.*?\]\((.*?)\)
!\[(.*?)\]\((.*?)\)

Capture groups are automatically numbered from left to right, starting at index 1, so these groups are available as:

  • matches[0].Groups[1]: which contains Image Description and
  • matches[0].Groups[2]: which contains /image-path.png

matches[0].Groups[0] is still the whole match.

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string content = @"![Image Description](/image-path.png)";

        // Group 1 captures the description, group 2 captures the path.
        var matches = new Regex(@"!\[(.*?)\]\((.*?)\)").Matches(content);

        Console.WriteLine(matches[0].Groups[1]); // description
        Console.WriteLine(matches[0].Groups[2]); // image path
    }
}

This outputs:

Image Description
/image-path.png

Here is a Runnable .NET Fiddle of the above.
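
Since the question mentions content with multiple images, the same two-group pattern could also be applied to every match in a loop. The following is only a sketch: the class MarkdownImages, the method Extract, and the sample content are hypothetical names and data chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MarkdownImages
{
    // Same pattern as in the answer: group 1 is the description, group 2 the path.
    private static readonly Regex ImagePattern = new Regex(@"!\[(.*?)\]\((.*?)\)");

    // Hypothetical helper: collects (description, path) for every image found.
    public static List<(string Description, string Path)> Extract(string content)
    {
        var images = new List<(string Description, string Path)>();
        foreach (Match match in ImagePattern.Matches(content))
        {
            images.Add((match.Groups[1].Value, match.Groups[2].Value));
        }
        return images;
    }

    public static void Main()
    {
        // Made-up sample content with two images.
        string content = "Intro ![First](/first.png) and ![Second](/second.png).";
        foreach (var (description, path) in Extract(content))
        {
            Console.WriteLine($"{description} -> {path}");
        }
    }
}
```

Reading Groups[1] and Groups[2] from each Match keeps the whole match in Groups[0] available in case the full image tag is needed later, for example to replace it in the content.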
