Combining regex to retrieve match cases

Time:06-23

I have the following string

[1] weight | width | depth | 5.0 cm | 6.0 mm^2 | 10.12 cm^3

From it I need to extract the name, value and unit of each item, like this:

name = weight
value = 5.0
unit = cm

name = width
value = 6.0
unit = mm^2

name = depth
value = 10.12
unit = cm^3

I have a regex for each of these cases, and each one works as expected on its own. I need to combine them so that a single pattern returns the expected matches. I tried simply concatenating them and also joining them with |, but neither worked. Here are the working regexes for the individual matches:

For Name : (?<name>\b\w+(?:[\w]\w+)+\b)
For Value : (?<![\^])(?<value>[+-]?[0-9]+(?:\.[0-9]+)?)(?!\S)
For Unit : \b[0-9]+(?:\.[0-9]+)?[^\S\r\n]+(?<unit>[^0-9\s]\S*)(?:[^\S\r\n]+\||$)

Can anyone help me with this? Thanks.

CodePudding user response:

If the number of pipes is always the same (here, three between each name and its value), you can use a capture group for the name, and capture the value and unit inside a lookahead:

(?<!\S)(?<name>\w+)(?=(?:[^|]*\|){3}\s*\b(?<value>[0-9]+(?:\.[0-9]+)?)\s+(?<unit>[^0-9\s]\S*))
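A minimal sketch of what the lookahead does in practice: the match itself consumes only the name, while the value and unit groups are filled from text further along the string. The class and method names (LookaheadDemo, Describe) are made up for illustration, and the pattern assumes the + quantifiers shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Pattern from the answer above; {3} is the number of pipes between a name and its value.
    public const string Pattern =
        @"(?<!\S)(?<name>\w+)(?=(?:[^|]*\|){3}\s*\b(?<value>[0-9]+(?:\.[0-9]+)?)\s+(?<unit>[^0-9\s]\S*))";

    // Hypothetical helper: returns one "name: value unit" line per match.
    public static string Describe(string input)
    {
        var lines = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            // m.Value is only the name; value and unit were captured inside the zero-width lookahead.
            lines.Add(string.Format("{0}: {1} {2}", m.Value, m.Groups["value"].Value, m.Groups["unit"].Value));
        }
        return string.Join("\n", lines);
    }

    public static void Main()
    {
        Console.WriteLine(Describe("[1] weight | width | depth | 5.0 cm | 6.0 mm^2 | 10.12 cm^3"));
    }
}
```

Because the lookahead is zero-width, the next match starts right after the previous name, which is why every name can still be paired with its own value further along.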


CodePudding user response:

Use this regex to capture the corresponding groups

\[\d+\]\s(\w+)\s\|\s(\w+)\s\|\s(\w+)\s\|\s(\S+)\s(\S+)\s\|\s(\S+)\s(\S+)\s\|\s(\S+)\s(\S+)

Then using substitution replace with

name = $1\nvalue = $4\nunit = $5\n\nname = $2\nvalue = $6\nunit = $7\n\nname = $3\nvalue = $8\nunit = $9
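As a rough sketch, the substitution could be run from C# with Regex.Replace. The class and method names (SubstitutionDemo, Format) are hypothetical, and the pattern assumes exactly three names followed by three value/unit pairs, as in the sample string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstitutionDemo
{
    // Groups 1-3 are the names; groups 4-9 are the value/unit pairs in order.
    private const string Pattern =
        @"\[\d+\]\s(\w+)\s\|\s(\w+)\s\|\s(\w+)\s\|\s(\S+)\s(\S+)\s\|\s(\S+)\s(\S+)\s\|\s(\S+)\s(\S+)";

    // A regular (non-verbatim) string, so \n is a real newline in the output.
    private const string Replacement =
        "name = $1\nvalue = $4\nunit = $5\n\nname = $2\nvalue = $6\nunit = $7\n\nname = $3\nvalue = $8\nunit = $9";

    // Hypothetical helper: rewrites the whole input line into the name/value/unit layout.
    public static string Format(string input)
    {
        return Regex.Replace(input, Pattern, Replacement);
    }

    public static void Main()
    {
        Console.WriteLine(Format("[1] weight | width | depth | 5.0 cm | 6.0 mm^2 | 10.12 cm^3"));
    }
}
```

This approach is tied to a fixed count of items: a string with two or four names would not match, and Regex.Replace would return it unchanged.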


CodePudding user response:

Just for reference on how you could use the pattern provided by @TheFourthBird

using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string s = "[1] weight | width | depth | 5.0 cm | 6.0 mm^2 | 10.12 cm^3";
        // Half of the pipe-separated fields are names, so n is the number of pipes between a name and its value.
        int n = s.Split('|').Length / 2;
        string pat = @"(?<!\S)(?<name>\w+)(?=(?:[^|]*\|){" + n + @"}\s*\b(?<value>[0-9]+(?:\.[0-9]+)?)\s+(?<unit>[^0-9\s]\S*))";

        var ItemRegex = new Regex(pat, RegexOptions.Compiled);
        var OrderList = ItemRegex.Matches(s)
                            .Cast<Match>()
                            .Select(m => new
                            {
                                Name = m.Groups["name"].ToString(),
                                // Parse with the invariant culture so "5.0" is read the same way on every machine.
                                Value = Convert.ToDouble(m.Groups["value"].ToString(), CultureInfo.InvariantCulture),
                                Unit = m.Groups["unit"].ToString(),
                            })
                            .ToList();
        Console.WriteLine(String.Join("; ", OrderList));
    }
}

Prints:

{ Name = weight, Value = 5, Unit = cm }; { Name = width, Value = 6, Unit = mm^2 }; { Name = depth, Value = 10.12, Unit = cm^3 }


Note: I am by no means a C# developer. I simply adapted code found here on SO to show how the answer given by TheFourthBird could work.
