How to check with regex if two alternating words are not used in turns in a sentence?

Time:12-25

I am trying to write what I thought would be a simple regex pattern, but it turned out to be unexpectedly complicated.

I am trying to detect if:

  1. Two alternating words are not used in turns in a sentence: detect "Cat cat."; do not detect "Cat dog."
  2. There can be one or more other words between these words: detect "The cat chased another cat."; do not detect "The cat chased another dog."
  3. The words can appear more than once in the sentence: detect "The cat chased the dog after the cat had chased another cat."; do not detect "The cat chased the dog after the cat had chased another dog."
  4. The sentence may include punctuation: detect "The cat chased the cat, and another cat chased, well – another dog."; do not detect "The cat chased the dog, and another cat chased, well – another dog."

This is what I have so far (in AutoHotkey):

    regex := "^(?:(?:(cat\b.*?(?<!cat)\bdog)|(dog\b.*?(?<!dog)\bcat)) |(?:cat|dog)\b.*?(?:cat|dog)\b)$"
    string := "The cat chased the cat, and another cat chased, well – another dog."

    ; AutoHotkey regex options go in front of the pattern: "i)" means case-insensitive.
    if RegExMatch(string, "i)" . regex) {
        MsgBox, in turns
    } else {
        MsgBox, not in turns
    }

But it does not work, and I'm stuck.

CodePudding user response:

This should be a piece of cake with a regex backreference. You could do something like:

/(\b\w+\b).*\b\1\b/

This regex matches if any word repeats itself in the string. You can try it out online.
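
A minimal sketch of how that backreference behaves, written in Python rather than AutoHotkey and assuming the pattern is meant to be `(\b\w+\b).*\b\1\b` with case-insensitive matching. The helper name `has_repeated_word` is made up for illustration.

```python
import re

# Capture one whole word, then require the same word again later in the string.
# IGNORECASE makes the backreference case-insensitive too, so "Cat" matches "cat".
REPEATED_WORD = re.compile(r"(\b\w+\b).*\b\1\b", re.IGNORECASE)


def has_repeated_word(sentence: str) -> bool:
    """Return True if any whole word occurs at least twice in the sentence."""
    return REPEATED_WORD.search(sentence) is not None


print(has_repeated_word("Cat cat."))                 # True
print(has_repeated_word("Cat dog."))                 # False
print(has_repeated_word("The cat chased the dog."))  # True, because "the" repeats
```

As the last call suggests, the pattern is not limited to cat and dog: any repeated word triggers it, so it probably needs to be restricted to the two words of interest before it fits the question.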

CodePudding user response:

To rephrase the problem: exclude/ignore a word between 2 words OR determine a specific word order in a sentence.

(cat(?:(?!dog).)*cat)|(dog(?:(?!cat).)*dog)

This regex works like this:

  • (cat(?:(?!dog).)*cat) finds 2 cat words and no dog word between them
  • (dog(?:(?!cat).)*dog) finds 2 dog words and no cat word between them
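  • (?:(?!dog).)* and (?:(?!cat).)* are non-capturing groups that consume one character at a time, but only at positions where dog (or cat) does not start, so the other word can never appear between the two matches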

regex101.com
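
To check the pattern against the sentences from the question, here is a small sketch in Python (not AutoHotkey). It assumes case-insensitive matching; the names `SAME_WORD_TWICE`, `not_in_turns` and `EXAMPLES` are only for illustration.

```python
import re

# Two "cat" with no "dog" between them, or two "dog" with no "cat" between them.
SAME_WORD_TWICE = re.compile(
    r"(cat(?:(?!dog).)*cat)|(dog(?:(?!cat).)*dog)",
    re.IGNORECASE,
)


def not_in_turns(sentence: str) -> bool:
    """Return True if the same word occurs twice in a row (the 'detect' case)."""
    return SAME_WORD_TWICE.search(sentence) is not None


# (sentence, should it be detected?) pairs taken from the question.
EXAMPLES = [
    ("Cat cat.", True),
    ("Cat dog.", False),
    ("The cat chased another cat.", True),
    ("The cat chased another dog.", False),
    ("The cat chased the dog after the cat had chased another cat.", True),
    ("The cat chased the dog after the cat had chased another dog.", False),
    ("The cat chased the cat, and another cat chased, well – another dog.", True),
    ("The cat chased the dog, and another cat chased, well – another dog.", False),
]

for sentence, expected in EXAMPLES:
    print(not_in_turns(sentence), expected)
```

The pattern has no word boundaries, so a word such as "category" would also count as cat; adding `\b` around the words may be worth considering if that matters.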

"Antipattern" (whole negation, finds only correct sentences):

^((?!((cat(?:(?!dog).)*cat)|(dog(?:(?!cat).)*dog))).)*$

regex101.com
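
The negated version can be tried the same way. This is again a Python sketch under the assumption of case-insensitive matching, with the hypothetical helper name `in_turns`.

```python
import re

# At every position, refuse to continue if a same-word pair starts there.
# The whole line only matches when no such pair exists anywhere in it.
IN_TURNS_ONLY = re.compile(
    r"^((?!((cat(?:(?!dog).)*cat)|(dog(?:(?!cat).)*dog))).)*$",
    re.IGNORECASE,
)


def in_turns(sentence: str) -> bool:
    """Return True if cat and dog strictly alternate (the 'do not detect' case)."""
    return IN_TURNS_ONLY.search(sentence) is not None


print(in_turns("Cat dog."))  # True
print(in_turns("Cat cat."))  # False
print(in_turns("The cat chased the dog after the cat had chased another dog."))  # True
print(in_turns("The cat chased the dog after the cat had chased another cat."))  # False
```

Because it matches the correct sentences, the result is the logical opposite of the first pattern for every example in the question.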
