I am trying to match these words with regex:
match | don't match |
---|---|
allochirally | anticker |
anticovenanting | corundum |
barbary | crabcatcher |
calelectrical | damnably |
entablement | foxtailed |
ethanethiol | galvanotactic |
froufrou | gummage |
furfuryl | gurniad |
galagala | hypergoddess |
heavyheaded | kashga |
linguatuline | nonimitative |
mathematic | parsonage |
monoammonium | pouchlike |
perpera | presumptuously |
photophonic | pylar |
purpuraceous | rachioparalysis |
salpingonasal | scherzando |
testes | swayed |
trisectrix | unbridledness |
undergrounder | unupbraidingly |
untaunted | wellside |
As you can tell, there is a pattern in the match column: in every word, the first three letters appear again, in the same order, later in the word.
CodePudding user response:
"Finding" a certain string, as you say, can be operationalized as "extracting". If your goal, then, is to extract the three letters that get repeated within the words, you can use this pattern:
(\w{3})(?=.*\1)
If the input contains only alphabetic characters, this can be simplified to:
(.{3})(?=.*\1)
The syntax here relies on two elements:
(.{3}) : a capture group matching exactly three characters
(?=.*\1) : a lookahead asserting that the same sequence of three characters must re-occur after zero or more intervening characters
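As a minimal sketch of how this extraction might look in Python (assuming the standard `re` module; the helper name `repeated_triple` is made up for illustration), the pattern can be applied with `search`. Note that the pattern is not anchored, so it reports the leftmost three-character sequence that recurs anywhere later in the string; prefixing it with `^` would restrict it to the first three letters.

```python
import re

# Pattern from the answer above: capture three characters, then require
# (via a lookahead) that the same three characters occur again later.
REPEATED_TRIPLE = re.compile(r"(\w{3})(?=.*\1)")

def repeated_triple(word):
    """Return the leftmost three-character sequence that recurs later in
    `word`, or None if there is none. (Hypothetical helper name.)"""
    m = REPEATED_TRIPLE.search(word)
    return m.group(1) if m else None

# A few words taken from the two columns in the question.
for w in ["barbary", "froufrou", "testes", "corundum", "swayed"]:
    print(w, repeated_triple(w))
```

For the words from the match column this returns the repeated prefix (for example `bar` for `barbary`), and `None` for the words from the other column.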
CodePudding user response:
The following regex can be used to match the words:
([a-zA-Z]{3})[a-zA-Z]*\1[a-zA-Z]*
([a-zA-Z]{3}) - Captures the first 3 letters as group 1
[a-zA-Z]* - Matches zero or more letters
\1 - Matches the same text that group 1 captured (the 3 letters)
[a-zA-Z]* - Matches zero or more trailing letters
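One way to try this pattern out, sketched in Python under the assumption that each word is tested on its own: `re.fullmatch` forces the capture group to start at the first character, so the check really is about the first three letters. The function name `first_three_repeat` and the two sample lists are only for illustration; the words are taken from the table in the question.

```python
import re

# Pattern from the answer above, matched against the whole word.
WORD = re.compile(r"([a-zA-Z]{3})[a-zA-Z]*\1[a-zA-Z]*")

def first_three_repeat(word):
    """True if the word's first three letters occur again later in the word.
    (Hypothetical helper name.)"""
    return WORD.fullmatch(word) is not None

should_match = ["allochirally", "mathematic", "testes",
                "undergrounder", "untaunted", "perpera"]
should_not_match = ["anticker", "corundum", "crabcatcher",
                    "swayed", "pylar", "kashga"]

print(all(first_three_repeat(w) for w in should_match))
print(any(first_three_repeat(w) for w in should_not_match))
```

If the words sit inside a longer text rather than being tested one at a time, wrapping the pattern in word boundaries (`\b...\b`) and using `re.findall` or `re.finditer` would be a reasonable variation.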