Tied optionality of regex groups without duplicating the obligatory pattern part

Time:10-22

I have a regex like "(?<opening>\[)?(?<body>\w+)(?<closing>\])?".
This is in .NET.
Currently both opening and closing are optional and independent.
So the question is: is it possible to make the closing match only if opening was encountered, otherwise treat as a mismatch?

Currently it matches all four possible variants: body, [body, body], [body].
But my aim is to match only body or [body].
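
To make the problem concrete, here is a minimal C# sketch that runs the original pattern (with `\w+` as the body) over the four variants and reports which optional groups took part. The helper name `Brackets` is only illustrative, and the snippet assumes a project that allows top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Original pattern: each bracket is optional on its own.
var independent = new Regex(@"(?<opening>\[)?(?<body>\w+)(?<closing>\])?");

// Hypothetical helper: reports which optional groups participated in the match.
string Brackets(string input)
{
    Match m = independent.Match(input);
    bool opening = m.Groups["opening"].Success;
    bool closing = m.Groups["closing"].Success;
    return $"opening={opening}, closing={closing}";
}

foreach (string input in new[] { "body", "[body", "body]", "[body]" })
{
    Console.WriteLine($"{input} -> {Brackets(input)}");
}
```

All four inputs match, and for [body and body] only one of the two bracket groups succeeds, which is exactly the unbalanced case I want to reject.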

P.S. I know it's possible via ((?<opening>\[)(?<body>\w+)(?<closing>\])|(?<body>\w+)),
but my actual <body> pattern is too large and complicated to duplicate like that.

CodePudding user response:

For the current scenario, you can use

(?:(?<o>\[)|(?<!\[))\b(?<body>\w+)(?(o)(?<c>])|(?![]\w]))

See the .NET regex demo. Details:

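  • (?:(?<o>\[)|(?<!\[)) - a non-capturing group that either captures a [ char into Group "o", or requires that the current position is not immediately preceded by a [ char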
  • \b - a word boundary (it works here since the next pattern part matches a word char)
  • (?<body>\w+) - Group "body": one or more word chars
  • (?(o)(?<c>])|(?![]\w])) - a conditional construct that checks whether the Group "o" stack is not empty (i.e. the opening [ was matched):
    • (?<c>]) - if so, matches a ] char and captures it into Group "c",
    • | - or else (if Group "o" did not match)
    • (?![]\w]) - requires that there is neither a ] nor a word char immediately to the right of the current location.
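
As a quick check of the breakdown above, here is a small C# sketch that runs the pattern against the four variants from the question. The helper name `Describe` is hypothetical, and the snippet assumes top-level statements (C# 9 or later); treat it as an illustration rather than a full test suite.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the answer: Group "o" records an opening bracket, and the
// conditional (?(o)...|...) decides what must follow the body.
var tied = new Regex(@"(?:(?<o>\[)|(?<!\[))\b(?<body>\w+)(?(o)(?<c>])|(?![]\w]))");

// Hypothetical helper: returns the captured body, or "no match".
string Describe(string input)
{
    Match m = tied.Match(input);
    return m.Success ? m.Groups["body"].Value : "no match";
}

foreach (string input in new[] { "body", "[body]", "[body", "body]" })
{
    Console.WriteLine($"{input} -> {Describe(input)}");
}
```

With these inputs, body and [body] should both yield the body group, while [body and body] should report no match. The body subpattern appears only once, so a larger body expression can be dropped in without duplication, provided it still starts with a word char (otherwise the \b needs adjusting).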