For a list of file names file_names, I am trying to use the code below to keep only the file names that contain neither foo nor bar:
file_names = ['foo_data.xlsx', 'bar_data.xlsx', 'data.xlsx']
subs = ['foo', 'bar']
for file_name in file_names:
    for sub in subs:
        if sub not in file_name:
            print(file_name)
Output:
foo_data.xlsx
bar_data.xlsx
data.xlsx
data.xlsx
But this is not the result I want: it should print only data.xlsx.
Meanwhile, the same approach works for the containing case:
file_names = ['foo_data.xlsx', 'bar_data.xlsx', 'data.xlsx']
subs = ['foo', 'bar']
for file_name in file_names:
    for sub in subs:
        if sub in file_name:
            print(file_name)
Out:
foo_data.xlsx
bar_data.xlsx
Could someone explain what is wrong with my code and how to fix it? Thanks.
Reference:
Does Python have a string 'contains' substring method?
CodePudding user response:
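Your loop prints a file name once for every sub it does not contain, so foo_data.xlsx is printed because it lacks bar, and data.xlsx is printed twice. Since you don't want any sub to be in the file name, one way is to replace the inner loop with all: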
for file_name in file_names:
    if all(sub not in file_name for sub in subs):
        print(file_name)
Output:
data.xlsx
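For reference, the same check can also be written with not any(...), which is logically equivalent to the all(...) form, and the results can be collected into a list instead of printed. This is only a sketch; the helper name filter_without is made up for illustration.

```python
def filter_without(file_names, subs):
    """Return the names that contain none of the substrings in subs."""
    # not any(sub in name ...) is equivalent to all(sub not in name ...)
    return [name for name in file_names
            if not any(sub in name for sub in subs)]


file_names = ['foo_data.xlsx', 'bar_data.xlsx', 'data.xlsx']
subs = ['foo', 'bar']
print(filter_without(file_names, subs))  # ['data.xlsx']
```

Both any and all stop at the first decisive element, so a name is rejected as soon as one forbidden substring is found.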
CodePudding user response:
One regex approach would be to form an alternation of the blacklisted substrings, then use re.search and a list comprehension to keep the file names that do not match.
import re

file_names = ['foo_data.xlsx', 'bar_data.xlsx', 'data.xlsx']
subs = ['foo', 'bar']
# Build the alternation (?:foo|bar) from the substrings
regex = r'(?:' + '|'.join(subs) + r')'
matches = [f for f in file_names if not re.search(regex, f)]
print(matches)  # ['data.xlsx']
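If the substrings might contain regex metacharacters such as a dot, or the filter is applied many times, a slightly more defensive sketch would escape each substring with re.escape and compile the pattern once. The function name build_filter is made up for illustration, and the sketch assumes subs is not empty, since an empty pattern would match every name.

```python
import re


def build_filter(subs):
    """Return a function that is True for names containing none of subs."""
    # re.escape makes characters such as '.' match literally
    pattern = re.compile('|'.join(re.escape(sub) for sub in subs))
    return lambda name: pattern.search(name) is None


keep = build_filter(['foo', 'bar'])
file_names = ['foo_data.xlsx', 'bar_data.xlsx', 'data.xlsx']
print([f for f in file_names if keep(f)])  # ['data.xlsx']
```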