How to capture brackets with variable in-between amount of space as a single group?

Time:12-16

Suppose I have the following text:

Yes: [x]
Yes: [  x]
Yes: [x  ]
Yes: [  x  ]
No: [
No: ]

I am interested in capturing the square brackets [ and ] that contain an x with a variable amount of horizontal space on either side of the x. The part I am struggling with is that both square brackets must be captured into a group with the same ID (i.e., $1).

I started with a combination of positive lookahead and lookbehind assertions using the following regex:

\[(?=\h*x)|(?<=x)\h*\K\]

This produces the matches shown in the first-attempt example.

Then I tried placing a capturing group around the whole expression, but the captured text then extends over the horizontal space that follows the positive lookbehind, i.e. the \h* in (?<=x)\h*, as shown in the second-attempt example.

I am using Oniguruma regular expressions and the PCRE flavor. Do you have any ideas if and how this can be done?

CodePudding user response:

You could make use of a branch reset group:

(?|(\[)(?=\h*x\h*])|(?<=\[)\h*x\h*(]))
  • (?| Branch reset group
    • (\[)(?=\h*x\h*]) Capture [ in group 1, asserting x between optional horizontal whitespace chars to the right followed by ]
    • | Or
    • (?<=\[)\h*x\h*(]) Assert [ to the left, then match x between optional horizontal whitespace and capture ] in group 1 as well (inside a branch reset group, every alternative restarts its group numbering at 1)
  • ) Close branch reset group
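
For readers whose engine has no branch reset group, the same idea can be approximated with two ordinary groups. The sketch below uses Python's built-in re module, which supports neither (?|...) nor \h, so it is an emulation rather than the pattern above: [ \t] stands in for \h, and the names BRACKET_RE and find_brackets are made up for illustration. Whichever of the two groups took part in the match is treated as "the" bracket group.

```python
import re

# Emulation for Python's built-in `re`, which lacks (?|...) and \h.
# Assumption: [ \t] is an acceptable stand-in for \h (horizontal space).
BRACKET_RE = re.compile(
    r"(\[)(?=[ \t]*x[ \t]*\])"    # "[" followed by optional space, x, optional space, "]"
    r"|"
    r"(?<=\[)[ \t]*x[ \t]*(\])"   # "]" reached after "[", optional space, x, optional space
)

def find_brackets(text):
    """Return (offset, bracket) pairs for every bracket that encloses an x."""
    results = []
    for m in BRACKET_RE.finditer(text):
        # Exactly one of the two groups participates in each match;
        # lastindex tells us which one it was.
        idx = m.lastindex
        results.append((m.start(idx), m.group(idx)))
    return results

sample = "Yes: [x]\nYes: [  x]\nYes: [x  ]\nYes: [  x  ]\nNo: [\nNo: ]\n"
brackets = find_brackets(sample)
print(brackets)
```

In PCRE itself none of this bookkeeping is needed, since the branch reset group already exposes both brackets as $1.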

Regex demo