Match non-word characters until enclosing non-word characters if found

Time:12-02

I want to match the non-word characters at the beginning and end of a string, but if the match reaches a non-word character that encloses a substring, I want it to stop right before that character.

I created the following regex pattern:

^\W+(?!(\W+)((?!\1).)+\1)?

I expected it to match the non-word characters from the beginning of the test string below up to the enclosing quotes, which the pattern in the negative lookahead is meant to detect:

!@#$%""<>test (string")- /;,

But the result was this:

!@#$%""<>test (string")- /;,

Regex101 Demo

What am I doing wrong?

CodePudding user response:

You need to use

^\W*?(\W)(?=.*?\b\1\W*$)

Details:

  • ^ - start of string
  • \W*? - zero or more non-word chars, as few as possible
  • (\W) - a non-word char captured into Group 1
  • (?=.*?\b\1\W*$) - a positive lookahead that matches a location that is immediately followed by
    • .*? - any zero or more chars other than line break chars, as few as possible
    • \b - a word boundary
    • \1 - same value as in Group 1
    • \W* - zero or more non-word chars
    • $ - end of string.
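
To check the pattern outside a regex tester, here is a minimal sketch using Python's re module. The choice of Python is an assumption, since the question does not name a regex flavor, and the names PATTERN, leading_match and sample are only illustrative. The pattern itself is copied unchanged from the answer.

```python
import re

# Pattern copied verbatim from the answer above.
PATTERN = re.compile(r'^\W*?(\W)(?=.*?\b\1\W*$)')


def leading_match(text):
    """Return the leading run of non-word characters matched by PATTERN, or None.

    Hypothetical helper for illustration only.
    """
    m = PATTERN.search(text)
    return m.group(0) if m else None


# Test string from the question.
sample = '!@#$%""<>test (string")- /;,'

print(leading_match(sample))     # !@#$%"
print(leading_match('--hello'))  # None: no later '-' follows a word character
```

Note that the match includes the first of the two quotes, because Group 1 consumes it. The lookahead only confirms that the same character appears again right after a word character, followed by nothing but non-word characters up to the end of the string.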