How to keep at least one file while looping in a directory?

Time:08-28

I am removing slices (images) from folders, but I have to keep at least one image in each folder. I want to modify my code so that at least one image is left in each folder.

The current code is:

count = []
folder_path = "/home/idu/Desktop/COV19D/train-seg3/covid"
# Change this path to the directory whose images need preprocessing
# The directory must contain folder(s), which hold the images
for fldr in os.listdir(folder_path):
    sub_folder_path = os.path.join(folder_path, fldr)
    for filee in os.listdir(sub_folder_path):
        file_path = os.path.join(sub_folder_path, filee)
        img = cv2.imread(file_path, 0)
        count = np.count_nonzero(
            img
        )  # Counting number of bright pixels in the binarized slices
        # print(count)
        if count > 1500:
            img = np.expand_dims(img, axis=2)
            img = array_to_img(img)
            # Replace images with the image that includes ROI
            img.save(str(file_path), "JPEG")
            # print('saved')
        else:
            # Remove non-representative slices
            os.remove(str(file_path))
            # print('removed')
        # Check that there is at least one slice left
        if not os.listdir(str(sub_folder_path)):
            print(str(sub_folder_path), "Directory is empty")
        count = []

The above code only informs me if a directory is left empty after the images are removed. I would like to modify the code so that at least one image is always left. How can I achieve that?

CodePudding user response:

You need to keep track of how many files are in the directory and how many you’ve removed:

for fldr in os.listdir(folder_path):
    sub_folder_path = os.path.join(folder_path, fldr)
    if os.path.isdir(sub_folder_path): # check what's a dir
        directory = os.listdir(sub_folder_path)
        files_left = len(directory)  # get initial count
        for filee in directory:
            file_path = os.path.join(sub_folder_path, filee)
            if os.path.isfile(file_path): # check what's a file
                img = cv2.imread(file_path, 0)
                count = np.count_nonzero(img)
                if count > 1500:
                    img = np.expand_dims(img, axis=2)
                    img = array_to_img(img)
                    img.save(file_path, "JPEG")
                else:
                    if files_left > 1:  # check if you should remove
                        os.remove(file_path)
                        files_left -= 1
                if not os.listdir(sub_folder_path):
                    print(sub_folder_path, "Directory is empty")

Also, you’re calling str on strings, which is unnecessary.

As @theherk pointed out though, this doesn't work recursively:

$ tree files/
files
├── a
│   ├── aa
│   │   ├── aaf1.png
│   │   └── aaf2.png
│   ├── ab
│   │   ├── abf1.png
│   │   └── abf2.png
│   ├── af1.png
│   └── af2.png
├── b
│   ├── bf1.png
│   └── bf2.png
└── f1.png
$ python3 keepOne.py files/
$ tree files/
files
├── a
│   ├── aa
│   │   ├── aaf1.png
│   │   └── aaf2.png
│   └── ab
│       ├── abf1.png
│       └── abf2.png
├── b
│   └── bf1.png
└── f1.png
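
Note that in the run above, a/ ends up with no images at all. With the code as shown, os.listdir also returns the subdirectories aa and ab, so files_left starts at 4 and both af1.png and af2.png pass the files_left > 1 check. A minimal sketch of one way around this is to count only regular files. The names prune_folder and is_representative are hypothetical; the callback stands in for the cv2.imread / np.count_nonzero test so that the sketch runs without OpenCV, and the re-save step is left out:

```python
import os


def prune_folder(folder, is_representative):
    """Remove non-representative files from `folder`, always leaving one file.

    `is_representative` is a hypothetical callback (path -> bool); in the
    question it would wrap cv2.imread and np.count_nonzero(img) > 1500.
    Returns the list of paths that were deleted.
    """
    # Count only regular files, so subdirectories do not inflate the total.
    with os.scandir(folder) as entries:
        files = sorted(entry.path for entry in entries if entry.is_file())

    files_left = len(files)
    removed = []
    for file_path in files:
        if is_representative(file_path):
            continue  # good slice, keep it
        if files_left > 1:  # never delete the last remaining file
            os.remove(file_path)
            files_left -= 1
            removed.append(file_path)
    return removed
```

Called once per sub-folder, this leaves at least one file in any folder that contained one to begin with, regardless of how many subdirectories sit next to the images.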

If you want to remove the files recursively, you should definitely use os.walk:

for dir_path, _, directory in os.walk(top_path):
    files_left = len(directory)  # get initial count
    for file_name in directory:
        file_path = os.path.join(dir_path, file_name)
        img = cv2.imread(file_path, 0)
        count = np.count_nonzero(img)
        if count > 1500:
            img = np.expand_dims(img, axis=2)
            img = array_to_img(img)
            img.save(file_path, "JPEG")
        else:
            if files_left > 1:  # check if you should remove
                os.remove(file_path)
                files_left -= 1

Which results in:

$ tree files
files
├── a
│   ├── aa
│   │   ├── aaf1.png
│   │   └── aaf2.png
│   ├── ab
│   │   ├── abf1.png
│   │   └── abf2.png
│   ├── af1.png
│   └── af2.png
├── b
│   ├── bf1.png
│   └── bf2.png
└── f1.png
$ python3 keepFile.py files
$ tree files
files
├── a
│   ├── aa
│   │   └── aaf1.png
│   ├── ab
│   │   └── abf1.png
│   └── af1.png
├── b
│   └── bf1.png
└── f1.png
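
With this loop, the file that survives in a folder where every image fails the threshold is simply the last one visited. If that choice matters, one option is to score every file first and spare the best one. The following is a rough sketch under that assumption, not part of the original answer: prune_tree, score and threshold are hypothetical names, score stands in for np.count_nonzero(cv2.imread(path, 0)), and the re-save step is omitted.

```python
import os


def prune_tree(top_path, score, threshold=1500):
    """In every directory under `top_path`, delete files scoring <= threshold,
    but keep the best-scoring file if none would otherwise survive."""
    for dir_path, _, file_names in os.walk(top_path):
        if not file_names:
            continue  # nothing to prune in this directory
        paths = [os.path.join(dir_path, name) for name in file_names]
        # First pass: score every file before deleting anything.
        scores = {path: score(path) for path in paths}
        to_remove = [path for path in paths if scores[path] <= threshold]
        if len(to_remove) == len(paths):
            # Every file failed: spare the one with the highest score.
            best = max(paths, key=scores.get)
            to_remove.remove(best)
        # Second pass: delete what is left on the removal list.
        for path in to_remove:
            os.remove(path)
```

Separating the scoring pass from the deletion pass costs one dictionary per directory, but it makes the "keep at least one" rule explicit instead of depending on iteration order.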

CodePudding user response:

I recommend using os.walk for this task, together with os.path.join and os.remove.

In this contrived example, we have a folder with nested directories, each containing one or more png files.

ᐅ exa -T -L 4 ./files
./files
├── a
│  ├── aa
│  │  ├── aaf1.png
│  │  └── aaf2.png
│  ├── ab
│  │  ├── abf1.png
│  │  └── abf2.png
│  ├── af1.png
│  └── af2.png
├── b
│  ├── bf1.png
│  └── bf2.png
└── f1.png

We walk the tree and, for each directory, iterate over the slice files[1:] (every file except the first), deleting each file in that slice.

import os

for path, _, files in os.walk("./files"):
    for f in files[1:]:
        os.remove(os.path.join(path, f))

Then we are left with directory contents:

ᐅ exa -T -L 4 ./files
./files
├── a
│  ├── aa
│  │  └── aaf2.png
│  ├── ab
│  │  └── abf2.png
│  └── af2.png
├── b
│  └── bf2.png
└── f1.png
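
Here aaf2.png survives rather than aaf1.png because os.walk yields file names in whatever order the file system returns them, which is not necessarily sorted. If the surviving file should be predictable, a small variation is to sort the names before slicing. This is only a sketch of that idea, and keep_first_sorted is a hypothetical helper name:

```python
import os


def keep_first_sorted(root):
    """Delete all but the alphabetically first file in each directory under `root`."""
    for path, _, files in os.walk(root):
        # sorted() fixes the order, so the same file is kept on every run.
        for name in sorted(files)[1:]:
            os.remove(os.path.join(path, name))
```

With the example tree, calling keep_first_sorted("./files") would keep aaf1.png, abf1.png, af1.png, bf1.png and f1.png.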