REGEX that matches a paragraph that doesn't contain a word avoiding \n

Time:12-04

I have a regex that finds a word inside a paragraph, even when the word directly follows a \n:

(?i)(?<=\b|\\n)cat\b

  1. Something something \ncat\n - match
  2. Something something cat - match
  3. Something something cats - no match, as expected

I want the negation of this regex: it should match only when the paragraph does not contain the word.

  1. Something something \ncat\n - no match - contains the word
  2. Something something cat - no match - contains the word
  3. Something something cats - match
  4. Something something \ncatss\n - match

I've tried a negative lookbehind, but that doesn't seem to work.

CodePudding user response:

To negate a regex, you can use a negative lookahead assertion. A negative lookahead succeeds only at a position where the specified pattern cannot be matched ahead. For example, to match a paragraph that does not contain the word "cat", you could use the following regex:

^(?!.*\bcat\b).*$

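This regex uses a negative lookahead assertion ((?!...)) placed at the start of the paragraph: the match succeeds only if nowhere ahead there is a word boundary (\b), the word "cat", and another word boundary. The ^ and $ anchors match the start and end of the paragraph, respectively. Note that . does not match a line break by default, so for a paragraph that spans several lines you need the (?s) flag as well, and (?i) if you want to keep the case-insensitivity of your original pattern.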

Here's an example of how you could use this regex in your code:

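// (?i) ignores case, as in the question; (?s) lets . cross real line breaks
val regex = Regex("(?is)^(?!.*\\bcat\\b).*$")
val paragraph = "Something something cat"

if (regex.matches(paragraph)) {
    // reached only when the paragraph does not contain the word "cat";
    // skipped for the paragraph above, which does contain it
}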

This code will not match the paragraph "Something something cat", because it contains the word "cat", so the body of the if is skipped. It will not match "Something something \ncat\n" either, for the same reason. It will match "Something something cats", because "cats" is a different word.
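If the \n in the question is the two-character text backslash plus n rather than a real line break (the \\n in the original pattern suggests this, but it is an assumption), \b alone is not enough, because n and c are both word characters. A minimal Kotlin sketch that puts the lookbehind from the question inside the negative lookahead could look like this; lacksWord is a hypothetical helper name, not part of the answer above:

```kotlin
// Hypothetical helper (not from the original answer): returns true when
// `paragraph` does not contain `word` as a separate word.
fun lacksWord(paragraph: String, word: String): Boolean {
    // Regex.escape quotes the word so characters such as '.' are taken literally.
    val quoted = Regex.escape(word)
    // (?i) ignores case, (?s) lets '.' match line breaks.
    // (?<=\b|\\n) is the lookbehind from the question: it accepts a word boundary
    // or a literal backslash followed by 'n' directly before the word.
    val pattern = Regex("""(?is)^(?!.*(?<=\b|\\n)$quoted\b).*$""")
    return pattern.matches(paragraph)
}
```

Called as lacksWord(paragraph, "cat"), this should return false for the first two examples in the question and true for the "cats" and "catss" examples, whether \n is a real line break or literal text.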

CodePudding user response:

If I understand correctly, you have a string containing several lines of text (separated by \n), and you want to find out whether the word cat is absent from the string as a separate word, where a 'separate word' is a sequence of word characters bounded by a word boundary or a newline.

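A newline is a non-word character, so \b already matches between a newline and an adjacent letter, and \b alone will do. This holds for real newline characters; if the text contains the two characters backslash and n instead, \b does not match between the n and the c, and the lookbehind from the question is still needed.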

This can be achieved by looking for the presence of cat and then negating the outcome. Using the Scala REPL:

scala> val wordRE = raw"\bcat\b".r.unanchored
val wordRE: scala.util.matching.UnanchoredRegex = \bcat\b

scala> !wordRE.matches("Something something \ncat\n")
val res1: Boolean = false

scala> !wordRE.matches("Something something cat")
val res2: Boolean = false

scala> !wordRE.matches("Something something cats")
val res3: Boolean = true

scala> !wordRE.matches("Something something \ncatss\n")
val res4: Boolean = true
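The same find-and-negate idea can be sketched in Kotlin, the language of the first answer, using Regex.containsMatchIn. The names plainWord, wordAfterLiteralN and lacks are made up for this illustration, and the second pattern only matters under the assumption that the paragraph may contain a literal backslash followed by n:

```kotlin
// \bcat\b is enough when \n is a real line break (a non-word character).
val plainWord = Regex("""(?i)\bcat\b""")

// Pattern from the question: also accepts a literal backslash + 'n'
// directly before the word.
val wordAfterLiteralN = Regex("""(?i)(?<=\b|\\n)cat\b""")

// Hypothetical helper: true when the regex finds no occurrence in the paragraph.
fun lacks(word: Regex, paragraph: String): Boolean = !word.containsMatchIn(paragraph)
```

Searching and negating keeps the pattern itself simple; the negation lives in ordinary code instead of in a lookahead, which tends to be easier to read and to extend.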