I have an image dataset that looks like this: Dataset
The timestep of each image is 15 minutes (as you can see, the timestamp is in the filename).
Now I would like to group those images in 3hrs long sequences and save those sequences inside subfolders that would contain respectively 12 images(=3hrs). The result would ideally look like this: Sequences
I have tried using os.walk and looping inside the folder where the image dataset is saved, then I created a dataframe using pandas because I thought I could handle the files more easily, but I think I am totally off target here.
CodePudding user response:
The timestep of each image is 15 minutes (as you can see, the timestamp is in the filename).
Now I would like to group those images in 3hrs long sequences and save those sequences inside subfolders that would contain respectively 12 images(=3hrs)
I suggest exploiting the built-in datetime library to get the desired result. For each file you have:
- get the substring that holds the timestamp
- parse it into a datetime.datetime instance using datetime.datetime.strptime
- convert that instance into seconds since the epoch using the .timestamp method
- compute the integer division (//) of that number of seconds by 10800 (the number of seconds in 3 hours)
- convert the value you got into str and use it as the target subfolder name
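The steps above could be sketched roughly as follows. The filename layout is an assumption, since the actual names are not shown in the text: the sketch assumes each name contains a 12-digit timestamp such as `img_202101010015.png`, so `TIMESTAMP_RE` and `TIMESTAMP_FORMAT` are hypothetical and need adapting to the real dataset. The helper names `sequence_folder_name` and `group_images` are also made up for illustration. The timestamp is treated as UTC so that the 3-hour buckets do not depend on the local timezone of the machine.

```python
import re
import shutil
from datetime import datetime, timezone
from pathlib import Path

SECONDS_PER_SEQUENCE = 3 * 60 * 60  # 10800 seconds in 3 hours

# Assumption: the filename contains 12 consecutive digits, YYYYmmddHHMM.
TIMESTAMP_RE = re.compile(r"(\d{12})")
TIMESTAMP_FORMAT = "%Y%m%d%H%M"


def sequence_folder_name(filename: str) -> str:
    """Return the subfolder name (3-hour bucket number) for one filename."""
    match = TIMESTAMP_RE.search(filename)
    if match is None:
        raise ValueError(f"no timestamp found in {filename!r}")
    moment = datetime.strptime(match.group(1), TIMESTAMP_FORMAT)
    # Treat the timestamp as UTC so the buckets are the same on every machine.
    moment = moment.replace(tzinfo=timezone.utc)
    return str(int(moment.timestamp()) // SECONDS_PER_SEQUENCE)


def group_images(dataset_dir, output_dir):
    """Copy every file in dataset_dir into output_dir/<bucket number>/."""
    output_dir = Path(output_dir)
    for image in sorted(Path(dataset_dir).iterdir()):
        if not image.is_file():
            continue
        target = output_dir / sequence_folder_name(image.name)
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy(image, target / image.name)
```

Because the folder is derived from the timestamp itself, a missing image leaves its folder with fewer than 12 files but does not shift the images that follow it into the wrong sequence.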
CodePudding user response:
Since you said you need exactly 12 files per folder (assuming the timestep is the same for all of them, so that 12 consecutive files always cover 3 hours), the following code can help you:
import os
import shutil

output_location = "location where you want to save them"  # better not to be in the same location as the dataset
dataset_path = "your data set"

# Full paths of all files under dataset_path, sorted so that consecutive files
# are consecutive timesteps (assuming the filenames sort chronologically).
files = sorted(os.path.join(path, file) for path, subdirs, filenames in os.walk(dataset_path) for file in filenames)

nr_of_files = 0
folder_name = ""
for index in range(len(files)):
    file_name = os.path.basename(files[index])
    if nr_of_files == 0:
        # First file of a sequence: create a folder named after it (without extension).
        folder_name = os.path.join(output_location, os.path.splitext(file_name)[0])
        os.makedirs(folder_name, exist_ok=True)
        shutil.copy(files[index], os.path.join(folder_name, file_name))
        nr_of_files = 1
    elif nr_of_files == 11:
        # 12th file of the sequence: copy it and reset the counter.
        shutil.copy(files[index], os.path.join(folder_name, file_name))
        nr_of_files = 0
    else:
        # Files 2 to 11: copy and keep counting.
        shutil.copy(files[index], os.path.join(folder_name, file_name))
        nr_of_files += 1
Explaining the code:
files holds all files found under dataset_path. You set dataset_path, and files will contain the entire path to every file.
The for loop iterates over the entire length of files.
nr_of_files is used to count off every 12 files. If it's 0, the code creates a folder named after files[index] in the location you set as output and copies the file there (placing it under the output path instead of the input path).
If it's 11 (counting from 0, nr_of_files == 11 means the 12th file), it copies the file and sets nr_of_files back to 0 so that the next file creates another folder.
The last else simply copies the file and increments nr_of_files.
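The same counting idea can also be written more compactly by slicing the sorted file list into chunks of 12. This is only a sketch of an alternative formulation, not the code above: the function name `copy_in_chunks` and its `chunk_size` parameter are hypothetical. It assumes, like the counter version, that `files` is already in chronological order and that no timestep is missing, since a gap would shift every later file into the wrong group.

```python
import os
import shutil


def copy_in_chunks(files, output_location, chunk_size=12):
    """Copy files into subfolders of output_location, chunk_size files each."""
    created = []
    for start in range(0, len(files), chunk_size):
        chunk = files[start:start + chunk_size]
        # Name the folder after the first file of the chunk, without extension.
        first_name = os.path.basename(chunk[0])
        folder = os.path.join(output_location, os.path.splitext(first_name)[0])
        os.makedirs(folder, exist_ok=True)
        for file_path in chunk:
            shutil.copy(file_path, os.path.join(folder, os.path.basename(file_path)))
        created.append(folder)
    return created
```

It could be called as `copy_in_chunks(files, output_location)` with the `files` list and `output_location` from the answer; if the number of files is not a multiple of 12, the last folder simply holds the remainder.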