How to group file names in a directory using the file name extensions with python?

Time:06-05

I am trying to group the files in a directory by extension using a dictionary, but my code is not behaving as expected. I have several video files ending in .mp4. My script gets the extension of each file and then checks whether it already exists as a key in the dictionary.

NB:

The dictionary holds extensions as keys and all the files with that extension as items.

If the extension exists as a key in the dictionary, the current file name is added to the items associated with that key. If it does not exist, a new key entry is created in the dictionary with null items. My code prints the elements of the dictionary like this:

'.mp4': 'Edited_20220428_154134.mp4', '.png': 'folder_icon.png',

You can see from the output above that my code does have the extensions as keys, but for the videos it contains only a single item even though there are several videos in that folder. I need help making it key on the extensions and add all the file names with that extension to the items associated with that key. Below is my code:

#import the os module
import os
# define the path to the documents folder
path = "C:\\Users\\USER\\Documents"
# list all the files in the directory
files = os.listdir(path)
# sort the files lexicographically
files.sort()
# code to ask user to choose file operation, in this context user chooses group by extension
# code omitted for MRE purposes
print("Grouping the files by extension")
# initialize the dictionary for holding the extension names
extensions = {}
# iterate through each file name and split the name to get file name and extension as a tuple
for file in files:
    ext = os.path.splitext(file)[1]
    if ext not in extensions.keys():  # if the extension key does not exist, add it to the dict as a new entry
        extensions[ext] = file
    else:  # if the extension already exists, append the item to the key's items
        extensions[ext] = file
# print the dict to check if the operation was successful
print(extensions)

CodePudding user response:

Try this:

for file in files:
    ext = os.path.splitext(file)[1]
    if ext not in extensions.keys():  
        extensions[ext] = [file]
    else:
        extensions[ext].append(file)

With extensions[ext] = file, you overwrite the stored file name every time the extension is found again. The = sets that dictionary item to a single file name, so only the last file with each extension survives.

In my code above, you create a list the first time the extension is found. And every time after that you add to the list.
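The same "create a list on first sight, append afterwards" logic can also be written without the explicit if/else. The following is only a sketch using collections.defaultdict from the standard library; the function name group_by_extension and the sample file names are made up for illustration and are not from the question.

```python
import os
from collections import defaultdict


def group_by_extension(names):
    """Group file names by extension (hypothetical helper, not from the question)."""
    groups = defaultdict(list)  # a missing key starts out as an empty list
    for name in names:
        ext = os.path.splitext(name)[1]
        groups[ext].append(name)  # always append, never overwrite
    return dict(groups)  # convert back to a plain dict for printing


# Made-up sample names; in the question these would come from os.listdir(path)
sample = ["a.mp4", "b.mp4", "icon.png", "README"]
grouped = group_by_extension(sample)
print(grouped)
# {'.mp4': ['a.mp4', 'b.mp4'], '.png': ['icon.png'], '': ['README']}
```

Files without an extension end up under the empty-string key, because os.path.splitext returns '' for them.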

CodePudding user response:

By doing

if ext not in extensions.keys():
   extensions[ext] = file
else:  # if the extension already append the item to the key items
    extensions[ext] = file

you are overwriting the value in the dictionary rather than appending to it. Try:

if ext not in extensions.keys():
    extensions[ext] = [file]
else:  # if the extension already exists, append the file to that key's list
    extensions[ext] += [file]

using the += operator to append the desired file names. The first branch has to store a list ([file]) rather than a bare string, otherwise there is no list to append to.
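To make the difference between replacing and appending concrete, here is a small sketch. It also shows dict.setdefault as one way to fold both branches into a single line; the helper name group_with_setdefault and the sample names are assumptions for illustration only.

```python
import os

# Plain assignment replaces whatever was stored under the key
overwritten = {}
for name in ["a.mp4", "b.mp4"]:
    overwritten[os.path.splitext(name)[1]] = name
print(overwritten)  # {'.mp4': 'b.mp4'}, only the last file is kept


def group_with_setdefault(names):
    """Hypothetical helper: append each name to the list for its extension."""
    extensions = {}
    for name in names:
        ext = os.path.splitext(name)[1]
        # setdefault returns the existing list, or stores and returns a new one
        extensions.setdefault(ext, []).append(name)
    return extensions


appended = group_with_setdefault(["a.mp4", "b.mp4", "icon.png"])
print(appended)  # {'.mp4': ['a.mp4', 'b.mp4'], '.png': ['icon.png']}
```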

CodePudding user response:

Try this if you want to group similar file names in a directory:

    import glob
    x=glob.glob(_filepath_)

    dictionary={}
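One possible way to carry on from a glob listing and an empty dictionary, assuming the goal is still grouping by extension. This is a sketch, not necessarily what the snippet above was going to do: the function name group_directory, the "*" pattern, and the choice to lower-case extensions are all assumptions.

```python
import glob
import os


def group_directory(directory):
    """Hypothetical helper: map each extension to the file names found in directory."""
    groups = {}
    # "*" matches every entry except names starting with a dot
    for full_path in sorted(glob.glob(os.path.join(directory, "*"))):
        if not os.path.isfile(full_path):
            continue  # skip sub-directories
        name = os.path.basename(full_path)
        # Lower-casing is an assumption so that .MP4 and .mp4 share one key
        ext = os.path.splitext(name)[1].lower()
        groups.setdefault(ext, []).append(name)
    return groups
```

With the path from the question this could be called as group_directory("C:\\Users\\USER\\Documents").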