Regex not strictly following negative lookahead?

Time:05-10

I have a regex that should match URLs that contain an & (ampersand) and do not have a quote (double or single) at the end.

The regex I made:

([^'` "\n]+)\.([^ \n]+)&([^ "`'\n]+)(?!["'])

but it just drops the last character and matches the rest of the URL.

Take this example:

google.com/cool?cool1=yes&cool2=no&cool3=no"

The URL should not match because it has a " at the end,

but the regex just leaves out the final 'o' and matches the rest of the URL.

All I want is that if a quote is present at the end, the whole URL is not matched at all.
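
The behaviour described above can be reproduced with a short sketch. It uses Python's `re` module, which is an assumption since the question does not name a regex flavour, and it assumes each group in the pattern carries a `+` quantifier. The names `ORIGINAL` and `quoted` are only illustrative.

```python
import re

# Pattern from the question, assuming a `+` after each character class.
ORIGINAL = re.compile(r"""([^'` "\n]+)\.([^ \n]+)&([^ "`'\n]+)(?!["'])""")

quoted = 'google.com/cool?cool1=yes&cool2=no&cool3=no"'

m = ORIGINAL.search(quoted)

# The last group first consumes "cool3=no"; the lookahead then sees the quote
# and fails. The engine backtracks by one character, so the lookahead is
# retried with "o" ahead of it, which is not a quote, and the match succeeds.
print(m.group(0))  # google.com/cool?cool1=yes&cool2=no&cool3=n
```

The negative lookahead is only checked at the position where the last group happens to stop, so backtracking can always find an earlier stopping point that satisfies it.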

CodePudding user response:

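You need to make the lookahead apply to the whole part after the ampersand, and then pin down where the match has to end so that backtracking cannot stop it early. For the end of the match we have the option of

  • $ end of the line
    or
  • (?=\s) positive lookahead for a whitespace character.

([^' "\n]+)\.([^ \n]+)&((?!["'])[^ "'\n])+($|(?=\s))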

See https://regex101.com/r/6ZGpSX/1
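
As a quick check of this approach, here is a minimal sketch in Python's `re` module (an assumption, since the thread does not name a language). The pattern is written with `+` quantifiers and a closing parenthesis on the final group; `FIXED` and `match_or_none` are hypothetical names used only for illustration.

```python
import re

# Lookahead applied per character after the ampersand, followed by an
# end-of-line or whitespace check that pins down where the match must end.
FIXED = re.compile(r"""([^' "\n]+)\.([^ \n]+)&((?!["'])[^ "'\n])+($|(?=\s))""")


def match_or_none(text):
    """Return the matched URL, or None when nothing in the text matches."""
    m = FIXED.search(text)
    return m.group(0) if m else None


print(match_or_none('google.com/cool?cool1=yes&cool2=no&cool3=no"'))  # None
print(match_or_none('google.com/cool?cool1=yes&cool2=no&cool3=no'))
print(match_or_none('see google.com/a?x=1&y=2 here'))  # google.com/a?x=1&y=2
```

Because the match can only end at the end of the line or before whitespace, the engine can no longer give back the final character to get past a trailing quote.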

CodePudding user response:

If you only need the match, you can omit the capture groups and use negated character classes. Also leave the backtick ` out of the negated character classes if you want to allow it in a match.

[^'"\s.]+\.[^\s'"&]+&[^\s"']+(?!\S)

Explanation

  • [^'"\s.]+ Match 1+ non-whitespace chars other than " ' .
  • \. Match a dot
  • [^\s'"&]+ Match 1+ non-whitespace chars other than " ' &
  • & Match literally
  • [^\s"']+ Match 1+ non-whitespace chars other than " '
  • (?!\S) Assert a whitespace boundary to the right

See a regex demo.
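
The pattern can also be tried on a line that contains several URLs. This is a minimal sketch assuming Python's `re` module and the `+` quantifiers described in the explanation; the sample text and the names `PATTERN`, `text` and `urls` are made up for illustration.

```python
import re

# No capture groups, so findall() returns the full matches.
PATTERN = re.compile(r"""[^'"\s.]+\.[^\s'"&]+&[^\s"']+(?!\S)""")

text = 'a google.com/cool?cool1=yes&cool2=no&cool3=no" b example.org/p?x=1&y=2 c'

# The first URL is followed by a double quote, so (?!\S) rejects it at every
# possible end position; the second URL is followed by a space and is kept.
urls = PATTERN.findall(text)
print(urls)  # ['example.org/p?x=1&y=2']
```

The `(?!\S)` assertion succeeds only before whitespace or at the end of the string, so shortening the last character class by backtracking never produces a valid end position.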
