I am having a hard time coming up with a regex to match a specific case:
This can be matched:
any-dashed-strings
this-can-be-matched-even-though-its-big
This cannot be matched:
strings starting with elem- or asdf- or a single -
elem-this-cannot-be-matched
asdf-this-cannot-be-matched
-
So far what I came up with is:
/\b(?!elem-|asdf-)([\w\-]+)\b/
But I keep matching a single - and the whole -this-cannot-be-matched suffix. I cannot figure out how to conditionally ignore a character that is part of the character class, and how to match nothing at all when one of the excluded prefixes is found.
I am currently working with the Oniguruma engine (Ruby 1.9 / PHP multi-byte string module).
If possible, please elaborate on the solution. Thanks a lot!
CodePudding user response:
If a lookbehind is supported, you can assert a whitespace boundary to the left, and make the alternation of both words optional while keeping the hyphen outside of it, so that a single - is rejected as well.
(?<!\S)(?!(?:elem|asdf)?-)[\w-]+\b
Explanation
(?<!\S) Assert a whitespace boundary to the left
(?! Negative lookahead, assert that what is directly to the right is not
(?:elem|asdf)?- Optionally match elem or asdf, followed by -
) Close the lookahead
[\w-]+ Match 1 or more word chars or -
\b A word boundary
See a regex demo.
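As a quick check, here is a minimal Ruby sketch of the lookbehind pattern run against the samples from the question. The constant DASHED_WORD and the helper dashed_words are names made up for this illustration, and the class is written as [\w\-], which is equivalent to [\w-].

```ruby
# Lookbehind pattern from the answer; [\w\-] is the same class as [\w-].
DASHED_WORD = /(?<!\S)(?!(?:elem|asdf)?-)[\w\-]+\b/

# Hypothetical helper: return every allowed dashed word found in text.
def dashed_words(text)
  text.scan(DASHED_WORD)
end

# Samples from the question, separated by whitespace.
text = "any-dashed-strings elem-this-cannot-be-matched - " \
       "asdf-this-cannot-be-matched this-can-be-matched-even-though-its-big"

# Only the two allowed strings should be printed.
p dashed_words(text)
```

Because the lookbehind requires whitespace (or the start of the string) to the left, the engine cannot start a match in the middle of a rejected word, which is what stops the -this-cannot-be-matched suffix from matching.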
Or a version with a capture group and without a lookbehind:
(?:\s|^)(?!(?:elem|asdf)?-)([\w-]+)\b
See another regex demo.
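The capture group variant can be tried the same way. This is only a sketch: DASHED_WORD_CAPTURE and captured_words are hypothetical names, and it relies on ^ matching at the start of each line, which is how Ruby treats it. The overall match includes the leading whitespace, so the wanted value is read from group 1.

```ruby
# Capture-group pattern from the answer; the word itself is in group 1.
DASHED_WORD_CAPTURE = /(?:\s|^)(?!(?:elem|asdf)?-)([\w\-]+)\b/

# Hypothetical helper: collect the group 1 value of every match.
# String#scan returns one array per match when the pattern has groups.
def captured_words(text)
  text.scan(DASHED_WORD_CAPTURE).flatten
end

# Samples from the question, one per line.
lines = "any-dashed-strings\nelem-this-cannot-be-matched\n-\n" \
        "this-can-be-matched-even-though-its-big"

p captured_words(lines)
```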