I have a folder structure similar to the following:
This structure is used to store images. New images are appended to the deepest available directory. A directory can hold a maximum of 100 images.
Examples:
The first 100 images added will have the path:
- X:\Images\DB\0\0\0\0\0\0\image_name.jpg
A random image may have the path:
- X:\Images\DB\0\2\1\4\2\7\image_name.jpg
The last 100 images added will have the path:
- X:\Images\DB\0\9\9\9\9\9\image_name.jpg
N.B. An image is only ever stored at the deepest possible directory.
- X:\Images\DB\0\x\x\x\x\x\IMAGES_HERE
E.G. There are no images stored in: X:\Images\DB\0\1\2\3
N.B. The deepest folder path to an image only exists if an image is stored there. Example:
- X:\Images\DB\0\9\9\9\9\9
... may not exist (and it doesn't in my case).
What I want to achieve is, beginning at the root directory, navigate through every possible path to the images and run a command.
I'm aware the time complexity for this is in terms of hours, if not days. It's a legacy storage option with the command migrating images to the cloud.
I have already managed to code some functions to allow me to travel to the current deepest directory and execute a command, but visiting all possible paths adds a complexity I'm struggling with - also I'm new to Python.
Here is the code:
# file generator
def files(path):
for file in os.listdir(path):
if os.path.isfile(os.path.join(path, file)):
yield file
# latest deepest directory
def get_deepest_dir(dir):
current_dir = dir
next_dir = os.listdir(current_dir)[-1]
if len(list(files(current_dir))) == 0:
next_dir = os.path.join(current_dir, next_dir)
return get_deepest_dir(next_dir)
else:
return current_dir
# perform command
def sync():
dir = get_deepest_dir(root_dir)
command = "<command_here>"
subprocess.Popen(command, shell=True)
CodePudding user response:
I used the following to search for csv / pdf files. I've left an example of what I wrote to search through all folders.
os.listdir - os.listdir() method in python is used to get the list of all files and directories in the specified directory.
os.walk - os.walk() method, in python is used to generate the file names in a directory tree by walking the tree either top-down or bottom-up.
#Import Python Modules
import os,time
import pandas as pd
## Search Folder
##src_path ="/Users/folder1/test/"
src_path ="/Users/folder1/"
path = src_path
files = os.listdir(path)
for f in files:
if f.endswith('.csv'):
print(f)
for root, directories, files in os.walk(path, topdown=False):
for name in files:
if name.endswith('.csv'):
print(os.path.join(root, name))
## for name in directories:
## print(os.path.join(root, name))
for root, directories, files in os.walk(path):
for name in files:
if name.endswith('.pdf'):
print(os.path.join(root, name))
## for name in directories:
## print(os.path.join(root, name))
CodePudding user response:
Thanks to @NeoTheNerd above for the solution. The adapted code which worked for me is here.
def all_dirs(path):
for root, directories, files in os.walk(path, topdown=False):
if sum(c.isdigit() for c in root) == 6:
print("Migrating Images From {}".format(root))
all_dirs("X:\\Images\\DB\\0")