I'm trying to match a pair of special characters, while excluding the enclosed content from the match. For example, ~some enclosed content~ should match only the pair of ~ and leave out some enclosed content entirely. I can only use vanilla PCRE, and capture groups aren't an option.
My pattern for matching the entire string is ~([^\s].*?(?<!\s))~ . Matching the first and second ~ separately would also be acceptable.
CodePudding user response:
Looking at your pattern, you want a non-whitespace char right after the opening ~ and a non-whitespace char right before the closing ~. As those are the delimiters, and the non-whitespace char should also not be ~ itself, you might use:
~(?=[^~\s](?:[^~\r\n]*[^\s~])?~)|(?<=~)[^\s~](?:[^~\r\n]*[^\s~])?\K~
Explanation
~                       Match literally
(?=                     Positive lookahead, assert that to the right is
[^~\s]                  Match a non-whitespace char except for ~
(?:                     Non-capture group
[^~\r\n]*[^\s~]         Match zero or more chars other than a newline or ~, followed by a non-whitespace char except for ~
)?                      Close the non-capture group and make it optional (to also match a single char, as in ~a~)
~                       Match literally
)                       Close the lookahead
|                       Or
(?<=~)                  Positive lookbehind, assert ~ to the left
[^\s~]                  Match a non-whitespace char except for ~
(?:[^~\r\n]*[^\s~])?    Same optional pattern as in the lookahead
\K                      Forget what is matched so far
~                       Match literally
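
If you want to sanity-check which ~ offsets the pattern should report, here is a rough Python sketch. It does not run the PCRE pattern itself: Python's built-in re module has no \K and only fixed-width lookbehind, so the sketch reuses just the lookahead alternative for the opening ~ and then takes the next ~ as the closing one (the lookahead already guarantees that one is valid). The name delimiter_offsets is made up for this example, and re.ASCII is an assumption to approximate PCRE's default \s.

```python
import re

# Only the first alternative of the PCRE pattern (the opening ~).
# re.ASCII keeps \s limited to ASCII whitespace, which is assumed
# here to approximate PCRE's default behaviour.
OPENING = re.compile(r"~(?=[^~\s](?:[^~\r\n]*[^\s~])?~)", re.ASCII)


def delimiter_offsets(text):
    """Return sorted offsets of every ~ that opens or closes a valid pair.

    Hypothetical helper: it emulates the result of the two-alternative
    PCRE pattern rather than executing it.
    """
    offsets = set()
    for m in OPENING.finditer(text):
        offsets.add(m.start())
        # The lookahead succeeded, so the enclosed text contains no ~ and
        # the very next ~ after the opening one is the closing delimiter.
        offsets.add(text.index("~", m.end()))
    return sorted(offsets)


print(delimiter_offsets("~some enclosed content~"))  # [0, 22]
print(delimiter_offsets("~ padded ~"))               # []
```

Note that a string like ~a~b~ yields three offsets, because the middle ~ closes one pair and opens the next; the PCRE pattern should treat it the same way.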