Python regex: Match any text excluding 2 words (probably look arounds)

Time:06-21

I am trying to make a regex that matches tags like this (so I can substitute them with blanks):

{# This is a comment #}
{% if cows > pandas %}
{{ my_variable }}

However, if the first word in the tag is include or extends, the regex must not match the tag (these words will always be paired with {% %} tokens, FYI).

e.g.

{% include 'foo.html' %}
{% extends 'foo/bar/baz.html' %}

I have the following Python regex:

{[{#%]\s*.*?(?!include|extends)[}#%]}

However, the negative assertion is not working (i.e. the include and extends tags are matched in the sample below):

{# match this #}
{% match this %}
{{ match this }}
asdf
{% foo 'match this' %} asdf {% foo 'match this' %} asdf
{% include 'not this' %}
{% extends 'not this' %}
asdf

Note: Yes this is to do with Django templating if you are interested!

Answer:

You can use

{[{#%](?!\s*(?:include|extends)\b).*?[}#%]}

See the regex demo.
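
As a minimal sketch of how this pattern might be applied to the substitution described in the question, assuming Python's standard re module (the names TAG_RE and strip_tags are illustrative and not part of the original post):

```python
import re

# Pattern from the answer: the lookahead sits right after the opening
# two characters, so it inspects the first word of the tag.
TAG_RE = re.compile(r"{[{#%](?!\s*(?:include|extends)\b).*?[}#%]}")


def strip_tags(text: str) -> str:
    """Replace every matching tag with an empty string (hypothetical helper)."""
    return TAG_RE.sub("", text)


# Sample text taken from the question.
sample = "\n".join([
    "{# match this #}",
    "{% match this %}",
    "{{ match this }}",
    "asdf",
    "{% foo 'match this' %} asdf {% foo 'match this' %} asdf",
    "{% include 'not this' %}",
    "{% extends 'not this' %}",
    "asdf",
])

cleaned = strip_tags(sample)
print(cleaned)
```

With this sample, the include and extends lines should be left untouched while the other tags are blanked out.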

Details:

  • { - a { char
  • [{#%] - a {, # or % char
  • (?!\s*(?:include|extends)\b) - a negative lookahead that fails the match if, immediately to the right of the current location, there are zero or more whitespace chars followed by include or extends as a whole word
  • .*? - any zero or more chars other than line break chars, as few as possible
  • [}#%] - a }, # or % char
  • } - a } char.
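
To illustrate why the position of the lookahead matters, here is a small sketch comparing the pattern from the question with the one above (the names ORIGINAL and FIXED are illustrative). In the question's pattern the lookahead comes after .*?, so it is only tested at the position just before the closing char, where the text is "%}" rather than include, and the assertion passes:

```python
import re

# Pattern from the question: the lookahead is checked after `.*?`,
# i.e. right before the closing `[}#%]}`, not at the first word.
ORIGINAL = re.compile(r"{[{#%]\s*.*?(?!include|extends)[}#%]}")

# Pattern from the answer: the lookahead is checked right after the opener.
FIXED = re.compile(r"{[{#%](?!\s*(?:include|extends)\b).*?[}#%]}")

tag = "{% include 'not this' %}"

original_hit = ORIGINAL.search(tag)
fixed_hit = FIXED.search(tag)

print(original_hit is not None)  # True: the whole tag is (wrongly) matched
print(fixed_hit is None)         # True: the tag is correctly skipped
```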