I have the following code:
import re
pattern = r"(?s)FUNCTION [A-Z]{4,6}(.*?)\b\w*(?<!,)END\b"
regex = re.compile(pattern)
with open('functions.f', 'r') as input_file:
    with open('stripped.f', 'w') as output_file:
        result = regex.sub('', input_file.read())
        output_file.write(result)
It finds functions whose names are 4 to 6 characters long and deletes them. I would like to delete only specific functions, given a list of names, so that if the input is
FUNCTION ABCD
END
FUNCTION EFGHI
END
FUNCTION JKLM
,END
FUNCTION NOPQRS
END
and the list of functions to remove is [EFGHI, NOPQRS], only those functions are removed and only
FUNCTION ABCD
END
FUNCTION JKLM
,END
remain in the output file.
How can I achieve this?
CodePudding user response:
Replace [A-Z]{4,6} with the list of specific function names separated by |. Put the alternation in a group so that it is matched as a single unit within the regexp.
funcs = ['EFGHI', 'NOPQRS']
# \b after the group stops 'EFGHI' from also matching a longer name such as 'EFGHIJ'
pattern = rf"(?s)FUNCTION ({'|'.join(funcs)})\b.*?\b\w*(?<!,)END\b"
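To see the whole thing working end to end, here is a minimal sketch that wraps the approach above in a function and runs it on the sample from the question. The helper name strip_functions and the SAMPLE string are made up for illustration. The sketch also makes three additions that are assumptions, not part of the answer above: it passes each name through re.escape, it puts a \b after the name group so a listed name cannot match the start of a longer one, and it ends the pattern with an optional \n so no blank line is left where a function was removed.

```python
import re

# Sample input from the question (hypothetical stand-in for functions.f).
SAMPLE = (
    "FUNCTION ABCD\n"
    "END\n"
    "FUNCTION EFGHI\n"
    "END\n"
    "FUNCTION JKLM\n"
    ",END\n"
    "FUNCTION NOPQRS\n"
    "END\n"
)


def strip_functions(source, names):
    """Remove every FUNCTION ... END block whose name is in `names`."""
    if not names:
        # An empty alternation would match every function, so bail out early.
        return source
    # Escape each name so regex metacharacters in a name are taken literally.
    alternation = "|".join(re.escape(name) for name in names)
    # Same pattern as in the answer, plus \b after the name and an optional
    # trailing newline (both are assumptions of this sketch).
    pattern = re.compile(
        rf"(?s)FUNCTION (?:{alternation})\b.*?\b\w*(?<!,)END\b\n?"
    )
    return pattern.sub("", source)


if __name__ == "__main__":
    print(strip_functions(SAMPLE, ["EFGHI", "NOPQRS"]), end="")
```

With the sample input, only the ABCD and JKLM blocks should remain. To use it with files as in the question, read functions.f, pass its contents and the list of names to strip_functions, and write the returned string to stripped.f.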