I am trying to match hyphens in a word, but only if the word contains more than one hyphen.
So in the phrase "Step-By-Step" the hyphens would be matched whereas in the phrase "Coca-Cola", the hyphens would not be matched.
In a full sentence combining phrases "Step-By-Step and Coca-Cola" only the hyphens within "Step-By-Step" would be expected to match.
I currently have the following expression, but it matches every hyphen that sits between non-digit characters, regardless of how many hyphens the word contains:
((?=\D)-(?<=\D))
I can't seem to get quantifiers to work with this expression. Any ideas?
CodePudding user response:
Here is a way to match all hyphens in a word (a run of non-whitespace characters) that contains more than one hyphen, in PCRE:
(?:(?:^|\s)(?=(?:[^\s-]*-){2})|(?!^)\G)[^\s-]*\K-
Explanation:
[^\s-]* matches zero or more characters that are neither whitespace nor a hyphen
(?=(?:[^\s-]*-){2}) is a lookahead that makes sure there are at least 2 hyphens in the non-whitespace substring
\G asserts position at the end of the previous match, or at the start of the string for the first match; the (?!^) in front of it rules out the start-of-string case
\K resets match info, so only the hyphen itself is reported as the match
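Python's built-in re module does not support \G or \K, so the pattern above will not work there unchanged. As a rough sketch of the same idea in Python (the function name multi_hyphen_positions is made up for illustration, and it assumes a "word" means a run of non-whitespace characters), you could scan word by word and keep the hyphens only from words that have two or more:

```python
import re

def multi_hyphen_positions(text):
    """Return the indices of hyphens that belong to a word with 2+ hyphens.

    Assumption: a "word" is any run of non-whitespace characters.
    """
    positions = []
    for word in re.finditer(r"\S+", text):
        # Absolute index of every hyphen inside this word.
        hyphens = [word.start() + m.start() for m in re.finditer(r"-", word.group())]
        if len(hyphens) >= 2:
            positions.extend(hyphens)
    return positions

print(multi_hyphen_positions("Step-By-Step and Coca-Cola"))  # [4, 7]
```

This gives up the single-regex form in exchange for working with the standard library only.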
CodePudding user response:
This matches at least two words, each followed by a hyphen, followed by another word (I'm assuming you don't want to allow a hyphen at the very beginning or end, only between words).
(\w+-){2,}\w+
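Note that a pattern of this shape matches the whole hyphenated word rather than the individual hyphens. A quick way to try it in Python, assuming the intended pattern is (\w+-){2,}\w+ with a + after each \w (the helper name hyphenated_words is only for illustration):

```python
import re

# Assumed form of the pattern: two or more "word-" groups, then a final word.
# A non-capturing group is used so that m.group() is the whole match.
PATTERN = re.compile(r"(?:\w+-){2,}\w+")

def hyphenated_words(text):
    """Return every word in text that contains at least two inner hyphens."""
    return [m.group() for m in PATTERN.finditer(text)]

print(hyphenated_words("Step-By-Step and Coca-Cola"))  # ['Step-By-Step']
```

If you need the hyphens themselves, you would still have to pick them out of each matched word afterwards.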