I'm looking for a regular expression (using JavaScript) to match all 3-6 letter words that contain all or some subset of the letters in a given word. For example, if my word is "rescue", then I should match "use", "cure", "reuse", "secure", and so on. What I have right now is:
\b(?!\w*(\w)\w*\1)[rescue]{3,6}\b
...And this works, except for repeated letters: it won't match words with 2 e's. Given the letters r, e, s, c, u, e, it should match at most 1 r, s, c, and u, and at most 2 e's.
Here I'm stuck. I'm not great at these things and it's a wonder I've come this far. This is not an operation I need to perform quickly or frequently, and the word list is only 20,000 words or so. I'm not concerned with the most efficient solution. I would love some help. Thanks.
CodePudding user response:
You don't need the negative lookahead as it is currently written. The part (?!\w*(\w)\w*\1) asserts that, to the right, there is no word character (\w) that occurs again later in the word (\w*\1). That rules out every repeated letter, including the second e.
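As a quick check of that explanation, here is a minimal sketch that runs the pattern from the question against the four example words. The variable names are only for illustration.

```javascript
// The pattern from the question, unchanged.
const originalRegex = /\b(?!\w*(\w)\w*\1)[rescue]{3,6}\b/;

// Example words taken from the question.
const exampleWords = ["use", "cure", "reuse", "secure"];

// Pair each word with whether the pattern accepts it.
const originalResults = exampleWords.map((w) => [w, originalRegex.test(w)]);

console.log(originalResults);
// "use" and "cure" are accepted; "reuse" and "secure" are rejected
// because each contains a second e.
```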
Instead you could assert the length first, and then match one of the characters r, e, s, c, u, e:
\b(?!\w*([rescue])\1)(?=\w{3,6}\b)\w*[rescue]\w*\b
See a regex 101 demo.
If the word can only consist of the listed characters:
\b(?!\w*([rescue])\1)[rescue]{3,6}\b
See another regex 101 demo.
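Note that the lookahead (?!\w*([rescue])\1) only rejects a letter that is immediately followed by itself, so a word such as "users" (two s's that are not adjacent) would still pass the second pattern. If the per-letter limits from the question need to be enforced, one possible approach is a separate negative lookahead per letter, each forbidding one occurrence more than the source word contains. The sketch below is not part of the original answer. The helper name buildSubsetRegex and its parameters are hypothetical, and it assumes the source word consists of plain lowercase letters with no regex metacharacters.

```javascript
// Hypothetical helper: build a regex that allows each letter at most as
// many times as it occurs in `word`. Assumes `word` is plain lowercase letters.
function buildSubsetRegex(word, minLen = 3, maxLen = word.length) {
  // Count how often each letter occurs in the source word.
  const counts = new Map();
  for (const ch of word) counts.set(ch, (counts.get(ch) || 0) + 1);

  // For a letter allowed n times, forbid n + 1 occurrences.
  // Example: (?!(?:\w*e){3}) rejects any word containing three e's.
  let lookaheads = "";
  for (const [ch, n] of counts) lookaheads += `(?!(?:\\w*${ch}){${n + 1}})`;

  // Character class of the distinct letters, e.g. [rescu].
  const letters = [...counts.keys()].join("");
  return new RegExp(`\\b${lookaheads}[${letters}]{${minLen},${maxLen}}\\b`, "g");
}

// The second pattern from the answer accepts "users" despite its two s's.
const adjacentOnlyRegex = /\b(?!\w*([rescue])\1)[rescue]{3,6}\b/;
console.log(adjacentOnlyRegex.test("users")); // true

const rescueRegex = buildSubsetRegex("rescue", 3, 6);
const sampleText = "use cure reuse secure users rescuer see eerie";
const subsetMatches = sampleText.match(rescueRegex);
console.log(subsetMatches); // [ 'use', 'cure', 'reuse', 'secure', 'see' ]
```

For a 20,000-word list, the same regex can be applied to each word in turn. Since efficiency is not a concern here, the extra lookaheads should not matter.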