I have a list of file paths and I want to extract only the paths that contain 'mp4'.
lists = ['/Users/me/1. intro.mp4', 'The mp4 version.vlc',
         '/Users/2. intro.vtt', '/Users/1. ppt.rar', '/Users/2. ppt.mp4']
Expected output:
['/Users/me/1. intro.mp4', 'The mp4 version.vlc','/Users/2. ppt.mp4']
I tried the code below, but it does not give me the correct output. My code:
lists = ['/Users/me/1. intro.mp4',
'/Users/2. intro.vtt', '/Users/1. ppt.rar', '/Users/2. ppt.mp4']
def Filter(string, substr):
    return [str for str in string if
            any(sub in str for sub in substr)]
searchString = 'mp4'
result = Filter(lists, searchString)
print(f'{result}')
If I run the program, it gives me the following output:
['/Users/me/1. intro.mp4', '/Users/1. ppt.rar', '/Users/2. ppt.mp4']
Can anybody tell me how to fix it?
CodePudding user response:
You just need to check whether substr is in each item of the list.
def Filter(string, substr):
    return [item for item in string if substr in item]
Your code, i.e.
any(sub in str for sub in substr)
checks whether ANY of the characters 'm', 'p', or '4' is in str, because the nested generator expression iterates over each character of substr itself.
I would also avoid using 'str' as a variable name, as you've done, since it shadows the built-in str class.
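To make the difference concrete, here is a small sketch that runs both checks on the list from the question. The function names by_any_char and by_substring are made up for illustration and do not appear in the original code.

```python
paths = ['/Users/me/1. intro.mp4', '/Users/2. intro.vtt',
         '/Users/1. ppt.rar', '/Users/2. ppt.mp4']

# Iterating over a string yields its characters one at a time.
print(list('mp4'))  # ['m', 'p', '4']

def by_any_char(items, substr):
    # Original logic: keep an item if ANY single character of substr occurs in it.
    return [item for item in items if any(ch in item for ch in substr)]

def by_substring(items, substr):
    # Fixed logic: keep an item only if the whole substring occurs in it.
    return [item for item in items if substr in item]

print(by_any_char(paths, 'mp4'))   # also keeps '/Users/1. ppt.rar' because of the 'p'
print(by_substring(paths, 'mp4'))  # keeps only the two .mp4 paths
```

'/Users/1. ppt.rar' passes the first check only because it contains the letter 'p', which is exactly the stray entry in the question's output.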
CodePudding user response:
Try this:
lists = ['/Users/me/1. intro.mp4',
'/Users/2. intro.vtt', '/Users/1. ppt.rar', '/Users/2. ppt.mp4']
def filterSubstr(lists, substr):
    return [x for x in lists if substr in x]
searchString = 'mp4'
print(filterSubstr(lists, searchString))
Result:
['/Users/me/1. intro.mp4', '/Users/2. ppt.mp4']
CodePudding user response:
I would suggest using the pathlib module, which makes it easy to check the file's actual extension. That is a more rigorous test than merely checking whether one string is a substring of another:
from pathlib import Path
file_paths = ['/Users/me/1. intro.mp4', '/Users/2. intro.vtt', '/Users/1. ppt.rar',
'/Users/2. ppt.mp4']
def filter_on_extension(paths, ext):
    return [path for path in paths if Path(path).suffix == ext]
file_extension = '.mp4'
result = filter_on_extension(file_paths, file_extension)
print(result) # -> ['/Users/me/1. intro.mp4', '/Users/2. ppt.mp4']
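Note that the two tests can disagree. As a rough sketch, assuming the five-item list from the top of the question (including 'The mp4 version.vlc'), the comparison below shows what each approach keeps. The variable names are illustrative only.

```python
from pathlib import Path

paths = ['/Users/me/1. intro.mp4', 'The mp4 version.vlc',
         '/Users/2. intro.vtt', '/Users/1. ppt.rar', '/Users/2. ppt.mp4']

# Path.suffix is the last dot-separated part of the file name, dot included.
suffixes = [Path(p).suffix for p in paths]
print(suffixes)  # ['.mp4', '.vlc', '.vtt', '.rar', '.mp4']

# Substring test: 'mp4' may appear anywhere in the path.
substring_matches = [p for p in paths if 'mp4' in p]

# Extension test: the file must actually end in '.mp4'.
suffix_matches = [p for p in paths if Path(p).suffix == '.mp4']

print(substring_matches)  # includes 'The mp4 version.vlc'
print(suffix_matches)     # excludes it
```

The substring test reproduces the expected output given in the question, while the extension test drops 'The mp4 version.vlc'. Which one is right depends on whether "contains mp4" means the text anywhere in the path or the file type.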