Python regex for deleting a given list of strings

Time:11-18

I have the following code:

import re

pattern = r"(?s)FUNCTION [A-Z]{4,6}(.*?)\b\w*(?<!,)END\b"
regex = re.compile(pattern)

with open('functions.f', 'r') as input_file:
    with open('stripped.f', 'w') as output_file:
        result = regex.sub('', input_file.read())
        output_file.write(result)

It finds functions whose names are 4 to 6 uppercase letters long and deletes them. I would like to delete only specific functions, given a list of names, so that if the input is

FUNCTION ABCD
END
FUNCTION EFGHI
END
FUNCTION JKLM
,END
FUNCTION NOPQRS
END

and the given list of functions to be removed is [EFGHI,NOPQRS], only the corresponding functions are removed and only

FUNCTION ABCD
END
FUNCTION JKLM
,END

remain in the output file.

How can I achieve this?

CodePudding user response:

Replace [A-Z]{4,6} with an alternation of the specific function names, joined with |. Wrap the alternation in a group so that | applies only to the names and not to the rest of the regexp.

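funcs = ['EFGHI', 'NOPQRS']
# re.escape protects names containing regex metacharacters; the \b after the
# group keeps a listed name from matching a longer name that starts with it.
pattern = rf"(?s)FUNCTION ({'|'.join(map(re.escape, funcs))})\b.*?\b\w*(?<!,)END\b"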
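
For reference, here is a minimal end-to-end sketch of the same idea, run on the sample from the question. The helper names build_pattern and strip_functions are hypothetical, not part of the original code. The trailing \n? is an extra assumption: it also consumes the line break after END so that no blank line is left behind. Drop it if you want to keep the original behaviour.

```python
import re


def build_pattern(names):
    """Compile a regex matching FUNCTION <name> ... END for the listed names only."""
    # Escape each name in case it contains regex metacharacters.
    alternation = "|".join(re.escape(name) for name in names)
    # (?:...) limits the | alternation to the names.
    # \b after the group stops EFGHI from also matching EFGHIJ.
    # \n? (an assumption, not in the original pattern) eats the newline after END.
    return re.compile(rf"(?s)FUNCTION (?:{alternation})\b.*?\b\w*(?<!,)END\b\n?")


def strip_functions(source, names):
    """Return source with the named functions removed."""
    # An empty alternation would match every function, so bail out early.
    if not names:
        return source
    return build_pattern(names).sub("", source)


sample = (
    "FUNCTION ABCD\n"
    "END\n"
    "FUNCTION EFGHI\n"
    "END\n"
    "FUNCTION JKLM\n"
    ",END\n"
    "FUNCTION NOPQRS\n"
    "END\n"
)

stripped = strip_functions(sample, ["EFGHI", "NOPQRS"])
print(stripped)
```

With the sample above, only the ABCD and JKLM functions should remain. To use it on files, pass input_file.read() as source and write the return value to the output file, as in the question's code.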