I am trying to match a string in my C# application with a regex:
MRT01_60DOOO3-0013577
The rules are: the first part (MRT01_60D) can contain multiple underscores, like MRT_01_02_60D, but after the last underscore the string must be an integer followed by 'D' or 'M', like:
MRT_01_02_620D or
MRT_01_02_60M or
MRT_03_12D
The last part (OOO3-0013577) must always be 12 characters long, with '-' at the 5th position.
After checking that the string matches, I would like to get 4 parts:
'MRT01'
'60D'
'OOO3-0013'
'577'
Could you help me find the regex?
Thanks a lot in advance.
Eric.
CodePudding user response:
For these 2 strings:
MRT01_60DOOO3-0013577
MRT_01_60MOOO3-0013577
The regex (MRT.*?\d{2}).*(\d{2}D|\d{2}M)(.{4}-.{4})(.{3})
will match the following:
MRT01_60DOOO3-0013577:
group 1 MRT01
group 2 60D
group 3 OOO3-0013
group 4 577
MRT_01_60MOOO3-0013577:
group 1 MRT_01
group 2 60M
group 3 OOO3-0013
group 4 577
After capturing and structuring the strings, in your language of choice (C#), just use replacements for characters you don't want, like _.
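As a minimal sketch of how this could look in C#, the snippet below applies the regex above with Regex.Match and strips underscores from group 1 with string.Replace. The class name FirstAnswerDemo and the method Extract are made up for illustration. Note that \d{2} captures exactly two digits, so a value such as 620D would yield 20D in group 2; \d+ may be a better fit if the integer can have any length.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstAnswerDemo
{
    // Hypothetical helper: returns the four captured parts, or an empty array if there is no match.
    public static string[] Extract(string input)
    {
        Match m = Regex.Match(input, @"(MRT.*?\d{2}).*(\d{2}D|\d{2}M)(.{4}-.{4})(.{3})");
        if (!m.Success)
        {
            return Array.Empty<string>();
        }

        return new[]
        {
            m.Groups[1].Value.Replace("_", ""), // drop unwanted underscores from the first part
            m.Groups[2].Value,
            m.Groups[3].Value,
            m.Groups[4].Value
        };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Extract("MRT01_60DOOO3-0013577")));
        // prints: MRT01, 60D, OOO3-0013, 577
        Console.WriteLine(string.Join(", ", Extract("MRT_01_60MOOO3-0013577")));
        // prints: MRT01, 60M, OOO3-0013, 577
    }
}
```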
CodePudding user response:
(MRT_?\d\d).*?(\d*[DM])((.{4}-.{4}))(.{3})
(MRT_?\d\d) finds the MRT part with 2 digits (\d) and an optional (?) underscore.
.*? makes it possible for any optional digits with underscores to be there. It is non-greedy (*?) because it would match the rest of the string otherwise.
(\d*[DM]) finds any number (*) of digits (\d) ending with D or M ([DM]).
((.{4}-.{4})) finds 4 ({4}) arbitrary characters (.) with a hyphen in the middle and another 4 characters.
(.{3}) finds 3 characters at the end.
Check it at Regex101
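All items you want to find are put in groups (()). Because the third part is wrapped in doubled parentheses, group 4 repeats group 3, so the four parts are in groups 1, 2, 3 and 5 of the match. Dropping the extra pair of parentheses gives groups 1 to 4.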
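Numbered groups are easy to miscount, so one option in C# is to use named groups instead. The sketch below is not the regex from this answer but a stricter, anchored variant written under some assumptions: the first part is taken to be everything before the last underscore, the integer must have at least one digit (\d+), and the whole string must match from start to end. The names PartCodeParser, TryParse, prefix, period, code and suffix are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PartCodeParser
{
    // Assumed layout: <prefix>_<integer><D|M><4 chars>-<4 chars><3 chars>
    // The 12-character tail is split into a 9-character code and a 3-character suffix.
    private static readonly Regex Pattern = new Regex(
        @"^(?<prefix>MRT\w*)_(?<period>\d+[DM])(?<code>.{4}-.{4})(?<suffix>.{3})$",
        RegexOptions.CultureInvariant);

    // Returns true and fills 'parts' with the four pieces when the input is valid.
    public static bool TryParse(string input, out string[] parts)
    {
        Match m = Pattern.Match(input);
        if (!m.Success)
        {
            parts = Array.Empty<string>();
            return false;
        }

        parts = new[]
        {
            m.Groups["prefix"].Value,
            m.Groups["period"].Value,
            m.Groups["code"].Value,
            m.Groups["suffix"].Value
        };
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        string[] samples =
        {
            "MRT01_60DOOO3-0013577",
            "MRT_01_02_620DOOO3-0013577",
            "MRT_03_12XOOO3-0013577" // invalid: 'X' instead of 'D' or 'M'
        };

        foreach (string s in samples)
        {
            Console.WriteLine(PartCodeParser.TryParse(s, out string[] parts)
                ? string.Join(" | ", parts)
                : "no match: " + s);
        }
        // prints:
        // MRT01 | 60D | OOO3-0013 | 577
        // MRT_01_02 | 620D | OOO3-0013 | 577
        // no match: MRT_03_12XOOO3-0013577
    }
}
```

Anchoring with ^ and $ means the same call both validates the string and extracts the parts. If only the leading MRT block is wanted, as in the answers above, the prefix group would need to be narrowed accordingly.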