Rename files as the ID from the directory name, many nested files

Time:02-26

I would like to rename my files that end with .txt and are nested deeply within a set of directories. The new name should be based on the name of the directory, taking ONLY the part before the "_", since that is the ID, so that each file becomes "ID_TreatedSubject.txt".

This is what I have to start: (it is a longer list, FYI)

/Users/Owner/Desktop/test/Blood2/4BA(ID)_Kidney_Blood_CEL-archive/Processed/FP004BA Experiment Group (Blood, RMA)/TreatedSubject.txt

/Users/Owner/Desktop/test/Blood2/4BA(ID)_Kidney_Blood_CEL-archive/Processed/FP004BA Experiment Group (Kidney, RMA)/TreatedSubject.txt

Desired output:

/Users/Owner/Desktop/test/Blood2/4BA(ID)_Kidney_Blood_CEL-archive/Processed/FP004BA Experiment Group (Blood, RMA)/4BA_TreatedSubject.txt

/Users/Owner/Desktop/test/Blood2/4BA(ID)_Kidney_Blood_CEL-archive/Processed/FP004BA Experiment Group (Kidney, RMA)/4BA_TreatedSubject.txt

This is the attempt:

import os
def list_files(dir):
    sub_folders = os.listdir(dir)
#     print(sub_folders)
    for sub_folder in sub_folders:
         sub_folder_path = os.path.join(dir,sub_folder)
         for root, dirs, files in os.walk(sub_folder_path):
#             print(type(root))
            for file in files:
                if file.endswith(".txt"):
                    a_string = root
                    partitioned_string = a_string.partition('_')
                    print(partitioned_string)
                    root = before_first_period = partitioned_string[0] 
                    new_filename = sub_folder + file
#                         print(new_filename)
                    os.rename(os.path.join(root, file), os.path.join(root, new_filename))
input_dir = "/Users/Owner/Desktop/test/Blood2/"
assert os.path.isdir(input_dir),"Enter a valid directory path which consists of sub-directories"
list_files(input_dir)

This is the error I am getting:

('/Users/Owner/Desktop/test/Blood2/4BA', '_', 'Kidney_Blood_CEL-archive/Processed/FP004BA Experiment Group (Blood, RMA)')
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-128-53f8924d30fa> in <module>
     19 input_dir = "/Users/Owner/Desktop/test/Blood2/"
     20 assert os.path.isdir(input_dir),"Enter a valid directory path which consists of sub-directories"
---> 21 list_files(input_dir)

<ipython-input-128-53f8924d30fa> in list_files(dir)
     16                     new_filename = sub_folder + file
     17 #                         print(new_filename)
---> 18                     os.rename(os.path.join(root, file), os.path.join(root, new_filename))
     19 input_dir = "/Users/Owner/Desktop/test/Blood2/"
     20 assert os.path.isdir(input_dir),"Enter a valid directory path which consists of sub-directories"

FileNotFoundError: [Errno 2] No such file or directory: '/Users/Owner/Desktop/test/Blood2/4BA/TreatedSubject.txt' -> '/Users/Owner/Desktop/test/Blood2/4BA/4BA_Kidney_Blood_CEL-archiveTreatedSubject.txt'

If I remove this block:

a_string = root
partitioned_string = a_string.partition('_')
print(partitioned_string)
root = before_first_period = partitioned_string[0] 

I get an output of:

/Users/Owner/Desktop/test/Blood2/4BA_Kidney_Blood_CEL-archive/Processed/FP004BA Experiment Group (Blood, RMA)/4BA_Kidney_Blood_CEL-archiveTreatedSubject.txt

/Users/Owner/Desktop/test/Blood2/4BA_Kidney_Blood_CEL-archive/Processed/FP004BA Experiment Group (Kidney, RMA)/4BA_Kidney_Blood_CEL-archiveTreatedSubject.txt

I just need it to split from the "_" instead of adding the entire line. Any help will be appreciated, I feel like I am quite close!

Answer:

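Calling os.walk once on the top-level directory makes this quite a bit simpler; the extra os.listdir loop is not needed. The FileNotFoundError comes from reassigning root to the text before the "_", so os.rename looks for the file in a directory that does not exist.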

If I understand this right, the ID sits between the first _ and the / path separator that precedes it. So we can take everything before the first _ with .partition('_') (.split('_', 1) would also work) and then keep the last path component with .split('/').
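To make the extraction concrete, here is a small sketch that applies both approaches to a directory path from the question. The variable names are only illustrative.

```python
# Directory path taken from the question.
path = "/Users/Owner/Desktop/test/Blood2/4BA_Kidney_Blood_CEL-archive/Processed"

# partition('_') splits at the first underscore into (before, '_', after).
before, sep, after = path.partition("_")
print(before)  # /Users/Owner/Desktop/test/Blood2/4BA

# The ID is the last path component of the part before the underscore.
id_from_partition = before.split("/")[-1]

# split('_', 1) splits at most once, so element 0 is the same prefix.
id_from_split = path.split("_", 1)[0].split("/")[-1]

print(id_from_partition, id_from_split)  # 4BA 4BA
```

Both variants assume that the first underscore in the whole path belongs to the ID directory; an underscore earlier in the path would give a different result.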

Because os.walk already separates the path from the filename, we can construct a new filename just by prepending the ID and an underscore to the existing filename.

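import os

def get_id(fp):
    # Text before the first '_' in the path, then its last path component.
    return fp.partition('_')[0].split('/')[-1]


def rename_files(root_dir):
    # os.walk visits every nested directory below root_dir.
    for root, dirs, files in os.walk(root_dir):
        for fname in files:
            if fname.endswith('.txt'):
                fp = os.path.join(root, fname)
                id_ = get_id(fp)
                # Prepend the ID and an underscore to the existing filename.
                new_fname = f'{id_}_{fname}'
                new_fp = os.path.join(root, new_fname)
                print('renaming', fp, 'to', new_fp)
                # Dry run: uncomment the next line once the printed names look right.
                # os.rename(fp, new_fp)

rename_files('/Users/Owner/Desktop/test/')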
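
The function above looks for the first underscore in the full path, so it depends on no parent directory above the ID directory containing one. A possible variant, sketched here under the assumption that the ID directory is the first component below root_dir with an underscore in its name, looks only at the path relative to root_dir and skips files that already carry the prefix. The names find_id and plan_renames are hypothetical helpers, not part of the original answer.

```python
import os

def find_id(rel_dir):
    """Return the text before the first '_' in the first path component
    that contains an underscore, or None if no component has one."""
    for part in rel_dir.split(os.sep):
        if "_" in part:
            return part.partition("_")[0]
    return None

def plan_renames(root_dir):
    """Collect (old_path, new_path) pairs without touching the disk."""
    pairs = []
    for root, dirs, files in os.walk(root_dir):
        # Only inspect the part of the path below root_dir.
        id_ = find_id(os.path.relpath(root, root_dir))
        if id_ is None:
            continue
        for fname in files:
            # Skip files that were already renamed in an earlier run.
            if fname.endswith(".txt") and not fname.startswith(id_ + "_"):
                old_path = os.path.join(root, fname)
                new_path = os.path.join(root, f"{id_}_{fname}")
                pairs.append((old_path, new_path))
    return pairs
```

Printing the pairs first and then calling os.rename(old_path, new_path) on each keeps the dry-run step from the answer, and running the plan a second time should return an empty list.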