I have a block of text as shown below.
import re
one = """
ASDFABC
ABC
ABC
ABC
ASDF
ASDF
ASDF
ASDFABC"""
two = """\
ABC
ABC
ABC
ASDF
ASDF
ASDF
ASDFABC"""
I am searching for a way to replace the whole block of lines starting with ABC with one single "TEST". For example, the variable one should result in:
"""
ASDFABC
TEST
ASDF
ASDF
ASDF
ASDFABC"""
As a side note, an ABC that does not start at the beginning of a line could be anywhere, as shown in the first and last lines of "one", and those occurrences should be ignored. Also, as shown in "two", ABC is not necessarily preceded by "\n".
How could this be done?
Attempts made so far:
>>> re.findall(r"(?:\nABC.*)+", one)
['\nABC\nABC\nABC']
>>> re.findall(r"(?:\nABC.*)+", two)
['\nABC\nABC']
>>> re.findall(r"(?:\nABC.*)+", two, re.M)
['\nABC\nABC']
>>> re.findall(r"(?:\nABC.*)+", two, re.MULTILINE|re.DOTALL)
['\nABC\nABC\n\nASDF\nASDF\nASDF\nASDFABC']
>>> re.findall(r"(?:^ABC.*)+", two, re.MULTILINE|re.DOTALL)
['ABC\nABC\nABC\n\nASDF\nASDF\nASDF\nASDFABC']
>>> re.findall(r"(?:^ABC.*)+", two, re.MULTILINE)
['ABC', 'ABC', 'ABC']
>>> re.findall(r"(?:\n*ABC.*)+", two, re.MULTILINE)
['ABC\nABC\nABC', 'ABC']
CodePudding user response:
You can use
re.sub(r'^ABC(?:\nABC)*$', 'TEST', text, flags=re.M)
See the regex demo. Details:
^ - start of a line (due to re.M)
ABC - a fixed string
(?:\nABC)* - zero or more repetitions of an LF char and the ABC string
$ - end of a line.
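As a quick check of the pattern described above, here is a minimal sketch that applies it to the two sample strings from the question. The helper name collapse_abc_blocks is made up for illustration and is not part of the original answer.

```python
import re

one = """
ASDFABC
ABC
ABC
ABC
ASDF
ASDF
ASDF
ASDFABC"""

two = """\
ABC
ABC
ABC
ASDF
ASDF
ASDF
ASDFABC"""

# ^ and $ match at line boundaries because of re.M, so only lines that
# consist of exactly "ABC" can start, continue, or end a match.
ABC_BLOCK = re.compile(r'^ABC(?:\nABC)*$', flags=re.M)


def collapse_abc_blocks(text):
    """Replace each run of consecutive "ABC" lines with a single "TEST" line.

    Hypothetical helper name, used here only for illustration.
    """
    return ABC_BLOCK.sub('TEST', text)


result_one = collapse_abc_blocks(one)
result_two = collapse_abc_blocks(two)
print(repr(result_one))
print(repr(result_two))
```

The ABC at the end of an ASDFABC line is left alone because ^ cannot match in the middle of a line, and the block at the very start of "two" is still found because ^ also matches at the start of the string.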
Note that re.M needs to be passed by keyword, as flags=re.M, since the next positional argument of re.sub after the input string is count, not flags.
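To illustrate the point about the count argument, the following sketch compares the keyword form with what a positional re.M effectively becomes. The sample text is invented for this example, and count=re.M is written out explicitly to mimic the positional call without relying on how a given Python version treats positional optional arguments.

```python
import re

pattern = r'^ABC(?:\nABC)*$'
text = "X\nABC\nABC\nY"  # made-up sample, not from the question

# Passing re.M as the fourth positional argument binds it to count,
# so it acts like count=8 and MULTILINE is never enabled.
as_count = re.sub(pattern, 'TEST', text, count=re.M)

# Passing it by keyword enables MULTILINE as intended.
as_flags = re.sub(pattern, 'TEST', text, flags=re.M)

print(repr(as_count))
print(repr(as_flags))
```

Without MULTILINE, ^ matches only at the very start of the string, so the first call finds nothing to replace and returns the text unchanged.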