Currently I have this for my regex replacement:
import re
line = re.sub(r"\bstr\b", ":class:`str`", line)
I want a result like the one below, where <, >, and ` stop the replacement from occurring, and no replacement occurs inside square brackets. I tried implementing negative lookarounds for just one of the characters, but I couldn't make it work.
Example input:
line = r"""list of str and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"""
Example output of what I am aiming for:
"list of :class:`str` and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"
CodePudding user response:
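Here is a solution with a negative lookbehind and a negative lookahead. The lookarounds only inspect the single character immediately before and after the match.
import re
line = r"""list of str and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"""
# Skip "str" when it is preceded by [, ` or <, or followed by ], ` or >.
# The \b anchors keep words such as "string" from matching.
pattern = r"(?<![\[`<])\bstr\b(?![\]`>])"
print(re.sub(pattern, r":class:`str`", line))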
Output:
list of :class:`str` and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]
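Because the lookarounds only see adjacent characters, a `str` in the middle of a bracketed list (for example `Tuple[int, str, int]`) would still be replaced. One possible way around that, sketched here under the assumption that brackets are balanced and backtick spans are never nested, is to let the regex also consume backtick spans and brackets and to track the bracket depth in the replacement callback. The names `TOKEN` and `annotate_str` are hypothetical and not part of the answer above.

```python
import re

# Alternatives are tried left to right at each position:
# a whole backtick span, a single bracket, or the bare word "str".
TOKEN = re.compile(r"`[^`]*`|[\[\]]|\bstr\b")

def annotate_str(text):
    """Replace bare `str` with :class:`str`, but only outside [...] and `...`."""
    depth = 0  # current square-bracket nesting level

    def repl(match):
        nonlocal depth
        tok = match.group(0)
        if tok == "[":
            depth += 1
        elif tok == "]":
            depth = max(depth - 1, 0)  # tolerate a stray closing bracket
        elif tok == "str" and depth == 0:
            return ":class:`str`"
        # Backtick spans, brackets, and bracketed "str" are left unchanged.
        return tok

    return TOKEN.sub(repl, text)

line = r"""list of str and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"""
print(annotate_str(line))
print(annotate_str("Tuple[int, str, int] of str"))
```

The first call prints the target output from the question; the second leaves the bracketed `str` alone and only replaces the trailing one.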
UPDATE, in response to the question in the comments:
Here is a conditional substitution approach, based on an idea by @Valdi_Bo.
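import re
line = r"""list of str and list of :class:`str` or :class:`string <str>` or Union[str, Tuple[int, str]]"""
pattern = r"\bstr\b"
def conditional_sub(match):
    # Characters adjacent to the match; slicing yields "" at either end of
    # the line instead of wrapping around or raising IndexError.
    before = line[max(match.start() - 1, 0):match.start()]
    after = line[match.end():match.end() + 1]
    if before not in ("[", "`", "<") and after not in ("]", "`", ">"):
        return r":class:`str`"
    else:
        return r"~str"
print(re.sub(pattern, conditional_sub, line))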
Output:
list of :class:`str` and list of :class:`~str` or :class:`string <~str>` or Union[~str, Tuple[int, ~str]]
match.start() and match.end() are just index numbers into the searched string. With them we can check the characters before and after the match, as the lookaround pattern above does, and decide what to substitute.
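To make the index arithmetic concrete, here is a minimal sketch of the same rule wrapped in a function that reads the text from `match.string` rather than from a global variable. The name `replace_unadorned_str` is hypothetical; the replacement strings are the ones used above.

```python
import re

def replace_unadorned_str(text):
    """Apply the conditional substitution to any text, without a global."""
    def repl(match):
        s = match.string  # the full string that was searched
        # match.start() is the index of the first matched character,
        # match.end() is the index just past the last one.
        before = s[max(match.start() - 1, 0):match.start()]
        after = s[match.end():match.end() + 1]
        if before not in ("[", "`", "<") and after not in ("]", "`", ">"):
            return ":class:`str`"
        return "~str"

    return re.sub(r"\bstr\b", repl, text)

# "str" spans indices 0-3 (nothing before it) and 8-11 (preceded by "[").
print(replace_unadorned_str("str or [str]"))  # :class:`str` or [~str]
```

Slicing instead of indexing means a match at the very start or end of the text simply sees an empty neighbour.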