Python regular expression escaping

Time:07-09

I am trying to use (?<!\\)# to match every # that is not preceded by a \ (the task is to escape the unescaped #s in a string). This regex works in several online regex validators, but it doesn't work with Python's re module. I also tried escaping the symbols in the regex, but that either raises errors or does not produce the expected output.

re.sub("(?<!\\)#","\#",'asd\#fh## #')

How can I modify this regex so that it produces the output asd\\#fh\\#\\# \\# (the \s in the output are escaped, hence the double \\)?

CodePudding user response:

You have a few issues in your code:

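  1. In a normal (non-raw) string, \# is not a recognized escape sequence, so Python keeps the backslash and 'asd\#fh## #' happens to equal r'asd\#fh## #'. Relying on this is fragile: Python warns about invalid escape sequences (a DeprecationWarning since 3.6, a SyntaxWarning since 3.12).
  2. Likewise, "\#" in the replacement only works by accident; write r"\#" or "\\#" to make the backslash explicit.
  3. "(?<!\\)#" raises a regex syntax error: in a normal string it becomes (?<!\)#, where \) is an escaped literal parenthesis, so the negative lookbehind is never closed.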
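To see what the regex engine actually receives for the pattern, it can help to print the string and try compiling it. The following is a minimal sketch; the helper name compiles is made up for illustration and is not part of the re module.

```python
import re

plain = "(?<!\\)#"   # normal string: the \\ collapses to a single backslash
raw = r"(?<!\\)#"    # raw string: both backslashes reach the regex engine

print(plain)  # (?<!\)#
print(raw)    # (?<!\\)#


def compiles(pattern):
    """Hypothetical helper: return True if the re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(plain))  # False: \) is a literal paren, so the group never closes
print(compiles(raw))    # True
```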

You need to use raw string mode or use double escaping to get it right:

import re
repl = re.sub(r"(?<!\\)#", r"\#", r'asd\#fh## #')
print(repr(repl))  # 'asd\\#fh\\#\\# \\#'
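The double-escaping alternative mentioned above might look like the sketch below. It assumes the same input as in the question; the variable names are illustrative only.

```python
import re

# Normal (non-raw) strings with every backslash doubled
text = "asd\\#fh## #"         # same characters as r'asd\#fh## #'
pattern = "(?<!\\\\)#"        # same characters as r"(?<!\\)#"
replacement = "\\#"           # same characters as r"\#"

result = re.sub(pattern, replacement, text)

print(result)        # asd\#fh\#\# \#
print(repr(result))  # 'asd\\#fh\\#\\# \\#'
```

The doubled backslashes in the repr output are only how Python displays the string; each \\ stands for a single backslash character in the actual result.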