Regex to delete multi line content between two specific words

Time:11-17

I have multiple instances of Fortran subroutines within a text file like the following:

SUBROUTINE ABCDEF(STRING1)
STRING2
STRING3
.
.
.
STRINGN
      END

How can I delete these subroutines, including their contents, in Python using regex?

I have already tried this piece of code without success:

with open(input, 'r') as file:
    output = open(stripped, 'w')
    try:
        for line in file:
            result = re.sub(r"(?s)SUBROUTINE [A-Z]{6}(.*?)\bEND\b", input)
            output.write("\n")
    finally:
        output.close()

CodePudding user response:

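Does this work? Your attempt fails for two reasons: re.sub needs three arguments (pattern, replacement, and the string to search), and it runs on one line at a time, so the pattern can never span the lines between SUBROUTINE and END. Reading the whole file and substituting once avoids both problems. I also replaced input with input_file, since input is a built-in function and shadowing it is bad practice.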

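import re

# re.DOTALL lets "." match newlines, so one match can span a whole subroutine.
pattern = r"SUBROUTINE [A-Z]{6}(.*?)\bEND\b"
regex = re.compile(pattern, re.DOTALL)
# input_file and stripped are the input and output paths from the question.
with open(input_file, 'r') as file:
    with open(stripped, 'w') as output_file:
        # Substitute over the whole file contents, not line by line.
        result = regex.sub('', file.read())
        output_file.write(result)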
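
One caveat with the pattern above: [A-Z]{6} only matches names that are exactly six uppercase letters, and the lazy (.*?) stops at the first standalone END it finds, which may be the END of an END IF or END DO inside the subroutine body. Below is a minimal sketch of a stricter variant, not a full Fortran parser. It assumes every subroutine starts with SUBROUTINE at the beginning of a line (after optional indentation) and closes with END, or END SUBROUTINE plus an optional name, alone on its own line. The names SUBROUTINE_BLOCK and strip_subroutines and the sample text are made up for illustration.

```python
import re

# Assumption: a subroutine runs from a line starting with SUBROUTINE to the
# first following line that contains only END or END SUBROUTINE [name].
# MULTILINE makes ^ and $ match at line boundaries; DOTALL lets .*? cross lines.
SUBROUTINE_BLOCK = re.compile(
    r"^[ \t]*SUBROUTINE\b.*?^[ \t]*END(?:[ \t]+SUBROUTINE\b[^\n]*)?[ \t]*$\n?",
    re.MULTILINE | re.DOTALL | re.IGNORECASE,
)


def strip_subroutines(source: str) -> str:
    """Return source with every matched SUBROUTINE ... END block removed."""
    return SUBROUTINE_BLOCK.sub("", source)


# Hypothetical sample: the END IF inside the body must not end the match early.
sample = (
    "      PROGRAM MAIN\n"
    "      CALL ABCDEF(X)\n"
    "      END\n"
    "      SUBROUTINE ABCDEF(STRING1)\n"
    "      IF (X .GT. 0) THEN\n"
    "        Y = 1\n"
    "      END IF\n"
    "      END\n"
    "      FUNCTION F(X)\n"
    "      END\n"
)

print(strip_subroutines(sample))
```

Requiring END to be alone on its line keeps block terminators such as END IF from cutting a match short, and leaves PROGRAM and FUNCTION units untouched. Free-form sources that put other statements on the END line would need a different pattern.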