Match pattern and get parts of string

Time:10-03

I am trying to match a string like this one with a regex in my C# application:

MRT01_60DOOO3-0013577

The rules are: the first part, MRT01_60D, can contain multiple underscores, as in MRT_01_02_60D, but after the last underscore there must be an integer followed by 'D' or 'M', for example:

MRT_01_02_620D or

MRT_01_02_60M or

MRT_03_12D

The last part, OOO3-0013577, must always be 12 characters long, with '-' at the 5th position.

Once the string has been checked against the pattern, I would like to extract 4 parts:

'MRT01'

'60D'

'OOO3-0013'

'577'

Could you help me find the regex?

Thanks a lot in advance.

Eric.

CodePudding user response:

For these 2 strings:

MRT01_60DOOO3-0013577
MRT_01_60MOOO3-0013577

The regex (MRT.*?\d{2}).*(\d{2}D|\d{2}M)(.{4}-.{4})(.{3}) captures the following groups:

MRT01_60DOOO3-0013577:
group 1     MRT01
group 2     60D
group 3     OOO3-0013
group 4     577

MRT_01_60MOOO3-0013577:
group 1     MRT_01
group 2     60M
group 3     OOO3-0013
group 4     577

After capturing the groups in C#, use a string replacement to strip any characters you don't want, such as _
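As a rough sketch of how this could look in C#, the snippet below applies the pattern from this answer with System.Text.RegularExpressions and strips the underscores from group 1. The class name FirstPatternDemo and the method Split are made up for illustration; only Regex, Match, Groups and string.Replace come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class FirstPatternDemo
{
    // Pattern copied unchanged from the answer above.
    private static readonly Regex Pattern =
        new Regex(@"(MRT.*?\d{2}).*(\d{2}D|\d{2}M)(.{4}-.{4})(.{3})");

    // Returns the four captured parts, or null when the input does not match.
    public static string[] Split(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success) return null;

        return new[]
        {
            m.Groups[1].Value.Replace("_", ""), // drop unwanted underscores
            m.Groups[2].Value,
            m.Groups[3].Value,
            m.Groups[4].Value
        };
    }

    public static void Main()
    {
        string[] inputs = { "MRT01_60DOOO3-0013577", "MRT_01_60MOOO3-0013577" };
        foreach (string s in inputs)
        {
            string[] parts = Split(s);
            Console.WriteLine(parts == null ? "no match" : string.Join(" | ", parts));
        }
        // prints:
        // MRT01 | 60D | OOO3-0013 | 577
        // MRT01 | 60M | OOO3-0013 | 577
    }
}
```

One thing to watch for: \d{2} in group 2 captures exactly two digits, so an input such as MRT_01_02_620DOOO3-0013577 gives 20D rather than 620D. The pattern is also not anchored, so extra characters before or after the match are not rejected.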

CodePudding user response:

(MRT_?\d\d).*?(\d*[DM])(.{4}-.{4})(.{3})
  • (MRT_?\d\d) matches MRT followed by an optional (?) underscore and 2 digits (\d)
  • .*? allows any further digits and underscores to be there. It is non-greedy (*?) because a greedy .* would swallow the digits that belong to the next group.
  • (\d*[DM]) matches any number (*) of digits (\d) ending with D or M ([DM])
  • (.{4}-.{4}) matches 4 ({4}) arbitrary characters (.), a hyphen, and another 4 characters
  • (.{3}) matches the 3 characters at the end

Check it at Regex101

Every part you want is wrapped in a capturing group, (...), so read groups 1 to 4 of the match.
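For reference, here is a minimal C# sketch that reads the four groups. It uses a slightly stricter variant of the pattern above, and the changes are my own assumptions about the rules in the question rather than part of this answer: ^ and $ anchor the whole string, an explicit _ is required before the integer, and \d+ requires at least one digit. The names StrictPatternDemo and TryGetParts are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating a stricter variant of the pattern.
public static class StrictPatternDemo
{
    // Assumed tightening: anchors, a literal underscore before the integer,
    // and \d+ instead of \d* so that at least one digit is required.
    private static readonly Regex Pattern =
        new Regex(@"^(MRT_?\d\d).*?_(\d+[DM])(.{4}-.{4})(.{3})$");

    // Returns true and fills parts with groups 1 to 4 when the input matches.
    public static bool TryGetParts(string input, out string[] parts)
    {
        Match m = Pattern.Match(input);
        parts = m.Success
            ? new[]
              {
                  m.Groups[1].Value,
                  m.Groups[2].Value,
                  m.Groups[3].Value,
                  m.Groups[4].Value
              }
            : new string[0];
        return m.Success;
    }

    public static void Main()
    {
        string[] inputs =
        {
            "MRT01_60DOOO3-0013577",      // MRT01, 60D, OOO3-0013, 577
            "MRT_01_02_620DOOO3-0013577", // MRT_01, 620D, OOO3-0013, 577
            "MRT01_DOOO3-0013577"         // no match: no integer before D
        };

        foreach (string s in inputs)
        {
            string[] parts;
            Console.WriteLine(TryGetParts(s, out parts)
                ? string.Join(", ", parts)
                : "no match");
        }
    }
}
```

Because the variant is anchored, a string whose last part is longer or shorter than 12 characters is rejected instead of being partially matched.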
