Iterate Through all Folders in a Drive - A Legacy Storage Option Migration to Cloud

Time:12-02

I have a folder structure similar to the following:

(Image: folder structure)

This structure is used to store images. New images are appended to the deepest available directory. A directory can hold a maximum of 100 images.

Examples:

The first 100 images added will have the path:

  • X:\Images\DB\0\0\0\0\0\0\image_name.jpg

A random image may have the path:

  • X:\Images\DB\0\2\1\4\2\7\image_name.jpg

The last 100 images added will have the path:

  • X:\Images\DB\0\9\9\9\9\9\image_name.jpg

N.B. An image is only ever stored at the deepest possible directory.

  • X:\Images\DB\0\x\x\x\x\x\IMAGES_HERE

E.G. There are no images stored in: X:\Images\DB\0\1\2\3

N.B. The deepest folder path to an image only exists if an image is stored there. Example:

  • X:\Images\DB\0\9\9\9\9\9

... may not exist (and it doesn't in my case).

What I want to achieve is, beginning at the root directory, navigate through every possible path to the images and run a command.
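
For illustration, "every possible path" can be enumerated directly from the layout. Assuming five variable levels below X:\Images\DB\0, each named 0 to 9, there are 10^5 = 100,000 candidate leaf directories (up to 10,000,000 images at 100 per directory). The sketch below is one way to generate the candidates and keep only those that exist; the function names and the levels and digits parameters are hypothetical and not part of the code in this post.

```python
import itertools
import os


def candidate_leaf_dirs(root, levels=5, digits="0123456789"):
    """Yield every possible leaf path under root, whether or not it exists."""
    # itertools.product yields ('0','0','0','0','0'), ('0','0','0','0','1'), ...
    for combo in itertools.product(digits, repeat=levels):
        yield os.path.join(root, *combo)


def existing_leaf_dirs(root, levels=5):
    """Keep only the candidate leaf paths that are present on disk."""
    return (p for p in candidate_leaf_dirs(root, levels) if os.path.isdir(p))
```

Because a leaf folder only exists once an image has been stored in it, filtering with os.path.isdir skips the paths that were never created.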

I'm aware that the running time for this will be measured in hours, if not days. This is a legacy storage system, and the command migrates the images to the cloud.

I have already written some functions that travel to the current deepest directory and execute a command, but visiting all possible paths adds complexity I'm struggling with. I'm also new to Python.

Here is the code:

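import os
import subprocess

# root of the image tree (assumed here from the example paths above)
root_dir = "X:\\Images\\DB\\0"

# file generator: yields the names of regular files directly inside path
def files(path):
    for file in os.listdir(path):
        if os.path.isfile(os.path.join(path, file)):
            yield file

# latest deepest directory: follow the last subdirectory until files are found
def get_deepest_dir(current_dir):
    if any(files(current_dir)):
        return current_dir
    # os.listdir returns entries in arbitrary order, so sort before taking the last
    next_dir = sorted(os.listdir(current_dir))[-1]
    return get_deepest_dir(os.path.join(current_dir, next_dir))

# perform command
def sync():
    deepest_dir = get_deepest_dir(root_dir)
    command = "<command_here>"
    subprocess.Popen(command, shell=True)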

CodePudding user response:

I used the following to search for CSV and PDF files, and I've left in an example of how I searched through all folders.

os.listdir - os.listdir() returns a list of the names of all files and directories in the specified directory.

os.walk - os.walk() generates the file names in a directory tree by walking the tree either top-down or bottom-up.

# Import Python modules
import os

# Search folder
# src_path = "/Users/folder1/test/"
src_path = "/Users/folder1/"
path = src_path

# CSV files directly inside path (no recursion)
for f in os.listdir(path):
    if f.endswith('.csv'):
        print(f)

# CSV files anywhere under path, visiting subdirectories before their parents
for root, directories, files in os.walk(path, topdown=False):
    for name in files:
        if name.endswith('.csv'):
            print(os.path.join(root, name))
    # for name in directories:
    #     print(os.path.join(root, name))

# PDF files anywhere under path, visiting parents first (the default)
for root, directories, files in os.walk(path):
    for name in files:
        if name.endswith('.pdf'):
            print(os.path.join(root, name))
    # for name in directories:
    #     print(os.path.join(root, name))
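
To make the top-down versus bottom-up difference concrete, here is a minimal sketch that builds a tiny temporary tree and records the order in which os.walk visits its directories. The helper name walk_order and the two-level tree are made up for the demonstration.

```python
import os
import tempfile


def walk_order(base, topdown):
    """Return directory paths relative to base, in the order os.walk visits them."""
    return [os.path.relpath(root, base)
            for root, _dirs, _files in os.walk(base, topdown=topdown)]


with tempfile.TemporaryDirectory() as base:
    # Build base/0/1 so there is exactly one chain of nested folders.
    os.makedirs(os.path.join(base, "0", "1"))
    top_down = walk_order(base, topdown=True)
    bottom_up = walk_order(base, topdown=False)

print(top_down)   # parents are listed before their children
print(bottom_up)  # children are listed before their parents
```

Either order visits every directory exactly once, so for a task that only acts on the deepest folders the choice mainly affects the order of the output.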

CodePudding user response:

Thanks to @NeoTheNerd above for the solution. The adapted code that worked for me is below.

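import os

def all_dirs(path):
    for root, directories, files in os.walk(path, topdown=False):
        # A leaf path such as X:\Images\DB\0\2\1\4\2\7 contains exactly six digits.
        # This check assumes no other part of the path contains a digit.
        if sum(c.isdigit() for c in root) == 6:
            print("Migrating Images From {}".format(root))

all_dirs("X:\\Images\\DB\\0")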
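
As a possible variation, the leaf folders could be identified by their depth below the root rather than by counting digits, which avoids false matches if the root path itself contains digits. The sketch below is an assumption-laden outline, not the code actually used for the migration: leaf_dirs, migrate and build_command are hypothetical names, and build_command stands in for whatever builds the real migration command as an argument list.

```python
import os
import subprocess


def leaf_dirs(root, depth=5):
    """Yield directories exactly `depth` levels below root, in sorted order."""
    root = os.path.abspath(root)
    for current, dirs, _files in os.walk(root):
        dirs.sort()  # visit 0, 1, 2, ... in a predictable order
        rel = os.path.relpath(current, root)
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if level == depth:
            dirs[:] = []  # prune: nothing deeper needs visiting
            yield current


def migrate(root, build_command, depth=5):
    """Run one command per leaf directory; return the leaves whose command failed."""
    failed = []
    for leaf in leaf_dirs(root, depth):
        # subprocess.run waits for the command to finish before moving on.
        result = subprocess.run(build_command(leaf))
        if result.returncode != 0:
            failed.append(leaf)
    return failed
```

Unlike subprocess.Popen, subprocess.run blocks until each command completes, so only one migration runs at a time and the list of failed folders can be retried afterwards.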