How to save a batch of processed images to a specific folder

I am trying to process a raw dataset, subfolder by subfolder, and save the output images to a destination folder.

So for example:

Images from the raw dataset folder video_0001 should be saved in the destination directory under a folder with the same name, video_0001.

The dataset contains over 200 folders. Here is the code I was able to come up with:

directory = "C:\\Users\\dataset\\distdir"
save_directory1 = "C:\\Users\\Desktop\\dataset\\distdir\\save_img\\Folder1"
save_directory2 = "C:\\Users\\Desktop\\dataset\\distdir\\save_img\\Folder2"

height = 512
width = 512


for root, dirs, files in os.walk(directory):
      for folder_name in dirs:
         cv2.imwrite(os.path.join(save_directory1, file), img)
         cv2.imwrite(os.path.join(save_directory2, file), img)  
      for file in files: 
         img = cv2.imread(os.path.join(root,file))
         print(img)
         img = cv2.resize(img, (height, width))
         img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
         img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX)
         print(img.shape)

But the result is that Folder1 and Folder2 each contain only the last image, and this approach is not feasible anyway because I have over 200 folders.

Any thoughts would be appreciated.

CodePudding user response：

You need to combine the two for loops. Otherwise, you'll overwrite the variables img and file again and again and will only save the last processed img.
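The overwriting described above can be reproduced without OpenCV. The sketch below is only an illustration: the file names are hypothetical and `name.upper()` is a stand-in for the actual image processing.

```python
# Hypothetical file names, used only to illustrate the control flow.
files = ["image_01.jpeg", "image_02.jpeg", "image_03.jpeg"]

# Separate loops: the processing loop finishes before anything is saved,
# so only the value from the final iteration is left to save.
for name in files:
    processed = name.upper()  # stand-in for the OpenCV processing steps
saved_separately = [processed]

# Combined loop: each result is saved while it is still the current value.
saved_combined = []
for name in files:
    processed = name.upper()
    saved_combined.append(processed)

print(saved_separately)
print(saved_combined)
```

In the first pattern only one result is ever saved, which matches the symptom in the question.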

Try:

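import os
import cv2

directory = "C:\\Users\\dataset\\distdir"
save_directory = "C:\\Users\\Desktop\\dataset\\distdir\\save_img"  # parent of Folder1/Folder2 in the question
height = 512
width = 512

for root, dirs, files in os.walk(directory):
    for file in files:
        img = cv2.imread(os.path.join(root, file))
        if img is None:
            continue  # skip files OpenCV cannot read as images
        img = cv2.resize(img, (width, height))  # dsize is (width, height)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX)
        # Save under a folder named after the source subfolder (e.g. video_0001)
        folder = os.path.join(save_directory, os.path.basename(root))
        os.makedirs(folder, exist_ok=True)  # cv2.imwrite does not create directories
        cv2.imwrite(os.path.join(folder, file), img)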

CodePudding user response：

First, define your preprocessing logic:

1. Walk through the origin folder to get each subfolder name.
2. Create a destination folder based on the origin folder.
3. Loop over all files inside that origin folder.
4. Apply the image preprocessing steps.
5. Save the image to the destination folder that was created.
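The first two steps above (finding the origin folder name and deriving a destination folder from it) could be sketched as a small path helper. This is a minimal sketch, not part of the original answer: `destination_for` is a hypothetical name, and the paths are made up to match the layout in the question. It uses `os.path.relpath`, so nested subfolders would be mirrored as well.

```python
import os

def destination_for(src_file, input_dir, output_dir):
    """Return the path under output_dir that mirrors src_file's position under input_dir."""
    relative = os.path.relpath(src_file, start=input_dir)
    return os.path.join(output_dir, relative)

# Hypothetical paths matching the layout described in the question.
src = os.path.join("dataset", "video_0001", "image_01.jpeg")
dst = destination_for(src, "dataset", "output")
print(dst)

# Before writing, the destination folder still has to be created, for example:
# os.makedirs(os.path.dirname(dst), exist_ok=True)
```

Keeping the path computation separate from the image processing makes it easy to check the folder layout before running the whole batch.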

For example, suppose this is the directory tree of your origin folder:

-dataset
- -folder_0001
- - -image_01.jpeg
- - -image_02.jpeg
- -folder_0002
- - -image_01.jpeg
- - -image_02.jpeg
...
- -folder_0200
- - -image_01.jpeg
- - -image_02.jpeg

import os
import cv2

input_dir = 'dataset'
output_dir = 'output'

height = 512
width = 512

for root, dirs, files in os.walk(input_dir):
    for file in files:
        # Create the output directory, named after the folder that contains the file
        output_dir_path = os.path.join(output_dir, os.path.basename(os.path.normpath(root)))
        os.makedirs(output_dir_path, exist_ok=True)

        # Preprocess the image
        img = cv2.imread(os.path.join(root, file))
        if img is None:
            continue  # cv2.imread returns None for files it cannot decode
        img = cv2.resize(img, (width, height))  # dsize is (width, height)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX)

        # Save it to the output directory
        cv2.imwrite(os.path.join(output_dir_path, file), img)

With this solution, you are able to handle all 200 subfolders.
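
One caveat: `os.walk` yields every file it finds, and `cv2.imread` returns `None` for a file it cannot decode, which makes the subsequent `cv2.resize` call fail. A possible guard is to filter by extension before processing. The sketch below is an assumption rather than part of the original answer: `IMAGE_EXTENSIONS` and `is_image_file` are hypothetical names, and the extension list should be adjusted to the dataset.

```python
import os

# Assumed set of extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}

def is_image_file(filename):
    """Return True if the filename has one of the assumed image extensions."""
    return os.path.splitext(filename)[1].lower() in IMAGE_EXTENSIONS

# Hypothetical directory listing with two non-image files mixed in.
files = ["image_01.jpeg", "IMAGE_02.JPG", "Thumbs.db", "notes.txt"]
images = [f for f in files if is_image_file(f)]
print(images)
```

Inside the `os.walk` loop, the same check could be applied as `if not is_image_file(file): continue` before reading the image.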