I have some folders that contain a lot of files. They are all named like this:
Name-0000000000.txt
Name-0000000001.txt
Name-0000000002.txt
Name-0000000003.txt
and so on.
There can be 5,000,000 files like this in a folder. How can I find out whether one or more files are missing?
I would like to check whether any consecutive number is missing, but I don't know how. I know I can get the first and last file in that folder:
import glob
import os
list_of_files = glob.glob('K:/path_to_files/*')
first_file = min(list_of_files, key=os.path.getctime)
latest_file = max(list_of_files, key=os.path.getctime)
print(first_file)
print(latest_file)
But I have no clue how to find missing files :(
Does anyone have an idea?
CodePudding user response:
I have not tried this code myself but something like this should work:
import glob
import os

list_of_files = glob.glob('K:/path_to_files/*')
# glob returns full paths, so compare base names; a set keeps each lookup fast
existing_names = {os.path.basename(path) for path in list_of_files}

for i in range(0, 5000000):  # Put the highest file number + 1 here
    some_file = "Name-" + str(i).zfill(10) + ".txt"
    if some_file not in existing_names:
        print("file: " + some_file + " is not in the list.")
This code might need some minor adjustments to work for your specific case but it should be enough to guide you in the correct direction :)
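If you don't know the highest number in advance, one option is to parse the number out of every file name and look for gaps between the smallest and largest number found. This is only a sketch under the assumption that every file follows the "Name-<10 digits>.txt" pattern from the question; the helper names parse_number, find_missing_numbers and list_names are made up for illustration.

```python
import os

def parse_number(name, prefix="Name-", suffix=".txt"):
    """Return the integer in a name like 'Name-0000000042.txt', or None if it does not match."""
    if name.startswith(prefix) and name.endswith(suffix):
        digits = name[len(prefix):len(name) - len(suffix)]
        if digits.isdecimal():
            return int(digits)
    return None

def find_missing_numbers(names, prefix="Name-", suffix=".txt"):
    """Return the sorted numbers missing between the lowest and highest number present."""
    parsed = (parse_number(name, prefix, suffix) for name in names)
    numbers = {n for n in parsed if n is not None}
    if not numbers:
        return []
    return [i for i in range(min(numbers), max(numbers) + 1) if i not in numbers]

def list_names(folder):
    """Return the names of the regular files in folder (hypothetical helper)."""
    with os.scandir(folder) as entries:
        return [entry.name for entry in entries if entry.is_file()]
```

You would call it as find_missing_numbers(list_names('K:/path_to_files')). Note that it can only see gaps between the lowest and highest existing number, so a missing first or last file will not be reported.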
CodePudding user response:
This solution only works if you know that exactly one file is missing. Sum the numbers from all the file names (after removing the "Name-" prefix and the ".txt" suffix and converting them to integers), then subtract that total from the expected sum of the full range. The result is the number of the missing file.
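A minimal sketch of that idea, assuming the numbers have already been extracted from the file names and that you know the first and last number of the range. The function name missing_number_by_sum is made up for illustration, and the count check is an extra safeguard I added, since a result of 0 would otherwise be ambiguous.

```python
def missing_number_by_sum(numbers, first, last):
    """Return the single number missing from the range first..last (inclusive)."""
    numbers = list(numbers)
    total = last - first + 1
    # The sum trick is only valid when exactly one number is absent.
    if len(numbers) != total - 1:
        raise ValueError("expected exactly one missing number")
    # Sum of an arithmetic series: (first + last) * count / 2
    expected = (first + last) * total // 2
    return expected - sum(numbers)
```

For example, with files numbered 0, 1, 3 and 4 present, the expected sum is 10 and the actual sum is 8, so the missing number is 2.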