How to perform this conditional regex replacement task, using lookarounds for bracket & quotation ch

Time:07-31

Currently I have this for my regex replacement:

import re
line = re.sub(r"\bstr\b", ":class:`str`", line)

I want a result like the one below, where a <, >, or ` next to str stops the replacement, and no replacement occurs inside square brackets. I tried implementing negative lookarounds for just one of the characters, but I couldn't make it work.

Example input:

line = r"""list of str and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"""

Example output of what I am aiming for:

"list of :class:`str` and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"

Answer:

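Here is a solution with a negative lookbehind and a negative lookahead. Note that it only inspects the characters immediately before and after str, so it does not detect every position inside square brackets: in Tuple[int, str, int] the middle str would still be replaced.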

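import re

line = r"""list of str and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"""
# Skip "str" when preceded by [, ` or <, or followed by ], ` or >; \b keeps words like "string" from matching.
pattern = r"(?<![\[`<])\bstr\b(?![\]`>])"
print(re.sub(pattern, r":class:`str`", line))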

Output:

list of :class:`str` and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]

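The lookarounds above only look at the characters directly next to str. If replacements should be skipped anywhere inside square brackets or backtick-quoted spans, one possible alternative is to match the "skip" regions and the word in a single alternation and track bracket depth in a callback. This is only a sketch, assuming brackets are balanced and backtick spans do not nest; replace_outside and its parameters are hypothetical names, not part of the original answer.

```python
import re

def replace_outside(text, word="str", replacement=":class:`str`"):
    """Replace `word` only outside backtick spans and square brackets (hypothetical helper)."""
    depth = 0  # current nesting level of square brackets
    # Alternation order matters: a whole backtick span is consumed first,
    # then single brackets, then the bare word.
    token = re.compile(r"`[^`]*`|[\[\]]|\b" + re.escape(word) + r"\b")

    def repl(match):
        nonlocal depth
        tok = match.group(0)
        if tok == "[":
            depth += 1
        elif tok == "]":
            depth = max(depth - 1, 0)  # tolerate a stray closing bracket
        elif tok == word and depth == 0:
            return replacement
        return tok  # backtick spans, brackets and bracketed words stay unchanged

    return token.sub(repl, text)

line = r"""list of str and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"""
print(replace_outside(line))
print(replace_outside("Tuple[int, str, int] or str"))
```

Because the callback returns the replacement string directly, backslashes in it are not processed, and the middle str in Tuple[int, str, int] is left alone since the depth is non-zero there.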

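UPDATE, in response to a question in the comments.
Here is a conditional substitution approach, based on an idea by @Valdi_Bo: re.sub accepts a function as the replacement, so each match can be inspected before deciding what to return.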

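import re

line = r"""list of str and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"""
pattern = r"\bstr\b"

def conditional_sub(match):
    # Slices, unlike indexes, stay in range when the match sits at the very start or end of the string.
    before = match.string[max(match.start() - 1, 0):match.start()]
    after = match.string[match.end():match.end() + 1]
    if before not in ('[', '`', '<') and after not in (']', '`', '>'):
        return ":class:`str`"
    return "~str"

print(re.sub(pattern, conditional_sub, line))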

Output:

list of :class:`str` and list of :class:`~str` or :class:`string <~str>` or Union[~str, Tuple[int, ~str]]

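match.start() and match.end() are the indexes where the match begins and ends (the end index points at the first character after the match). With them we can check the characters before and after the match, as the lookarounds in the first pattern do, and decide what to return as the replacement.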
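To make those indexes concrete, here is a small sketch that lists every match together with its neighbouring characters. The neighbours helper and the shortened sample line are made up for illustration.

```python
import re

sample = "list of str and Union[str, int]"

def neighbours(text, pattern=r"\bstr\b"):
    """Return (start, end, char_before, char_after) for every match (hypothetical helper)."""
    rows = []
    for m in re.finditer(pattern, text):
        # Slicing yields '' instead of raising when the match touches either end of the text.
        before = text[max(m.start() - 1, 0):m.start()]
        after = text[m.end():m.end() + 1]
        rows.append((m.start(), m.end(), before, after))
    return rows

for row in neighbours(sample):
    print(row)
# (8, 11, ' ', ' ')
# (22, 25, '[', ',')
```

The second row shows why the bracketed str is treated differently: the character before it is [.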
