Custom sorting a Python list with nested dictionaries

Time:04-14

I'm trying to sort a Python list that represents a file structure. Each entry is a dictionary: either a file (with file_name and endpoint keys) or a folder (a single key, the folder name, mapping to a list of entries). I want all folders to appear first, in alphabetical order, followed by the files. I've taken a stab at sorting but ran into a KeyError. Does anyone have a recommended solution?

Here is what I currently have:

[
    {
        'file_name': 'abc.txt',
        'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/abc.txt'
    }, 
    {
        'src': [
            {
                'file_name': 'jump.sql',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/jump.sql'
            },
            {
                'file_name': 'test.txt',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/test.txt'
            },
            {
                'file_name': 'tester.txt',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/tester.txt'
            }
        ]
    },
    {
        'test': [
            {
                'file_name': 'test.java',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/test.java'
            },
            {
                'file_name': 'testerjunit.cpp',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/testerj.cpp'
            }
        ]
    },
    {
        'file_name': 'test.log',
        'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/test.log'
    }
]

And here is what I am looking to have the sorted output look like:

[
    {
        'src': [
            {
                'file_name': 'jump.sql',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/jump.sql'
            },
            {
                'file_name': 'test.txt',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/test.txt'
            },
            {
                'file_name': 'tester.txt',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/tester.txt'
            }
        ]
    },
    {
        'test': [
            {
                'file_name': 'test.java',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/test.java'
            },
            {
                'file_name': 'testerjunit.cpp',
                'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/testerj.cpp'
            }
        ]
    },
    {
        'file_name': 'abc.txt',
        'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/abc.txt'
    }, 
    {
        'file_name': 'test.log',
        'endpoint': '/code/d20cb114-b68c-11ec-b468-a063919f3f30/test.log'
    }
]

I attempted to use a lambda function to sort by the key file_name, but that gives me a KeyError as the key is not directly in every dict.

res.sort(key=lambda e: e['file_name'], reverse=True)

Where res is the list object.

Anyone know of a better way to go about doing this?

TIA!

CodePudding user response:

You could do the following:

folders, files = [], []

for obj in res:
    if len(obj) == 1:
        folders.append(obj)
    else:
        files.append(obj)

folders.sort(key=lambda e: next(iter(e.keys())))
files.sort(key=lambda e: e['file_name'])
res = folders + files

This code splits the objects into two lists, sorts each list separately, and then concatenates them. It assumes that every entry representing a folder is a dictionary of length one, whose single key is the folder name; the folders list is sorted by that key. Note that this approach also sorts the files that are not in folders, which can be avoided by removing the line files.sort(key=lambda e: e['file_name']). Also note that this does not sort the files within folders, which could be achieved by adding the following code:

for folder in folders:
    folder_name, file_names = next(iter(folder.items()))
    folder[folder_name] = sorted(file_names, key=lambda e: e['file_name'])
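The same top-level ordering (folders first, then files, each alphabetical) can also be written as a single sort with a tuple key instead of two lists. This is only a minimal sketch: sort_key and the sample data are hypothetical names made up for illustration, and it assumes a folder is any dict without a 'file_name' key that holds exactly one key.

```python
def sort_key(entry):
    # Files carry a 'file_name' key; they get rank 1 so they sort last.
    if 'file_name' in entry:
        return (1, entry['file_name'])
    # Assumption: a folder dict has exactly one key, the folder name.
    return (0, next(iter(entry)))

# Hypothetical sample data, shaped like the list in the question.
sample = [
    {'file_name': 'test.log', 'endpoint': '/demo/test.log'},
    {'test': []},
    {'file_name': 'abc.txt', 'endpoint': '/demo/abc.txt'},
    {'src': []},
]

ordered = sorted(sample, key=sort_key)
names = [e.get('file_name') or next(iter(e)) for e in ordered]
print(names)  # ['src', 'test', 'abc.txt', 'test.log']
```

Tuples compare element by element, so the 0/1 rank decides folder versus file and the name only breaks ties within each group.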

Edit: The following function puts all this together and also allows arbitrary nesting levels:

def sort_objects(objects):
    folders = list(filter(lambda o: len(o) == 1, objects))
    files = list(filter(lambda o: len(o) != 1, objects))
    for folder in folders:
        name, inner_objects = next(iter(folder.items()))
        folder[name] = sort_objects(inner_objects)
    sorted_folders = sorted(folders, key=lambda e: next(iter(e.keys())))
    sorted_files = sorted(files, key=lambda e: e['file_name'])
    return sorted_folders + sorted_files
    
res = sort_objects(res)
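As a quick check of the recursive version, here is a self-contained sketch that repeats the function in slightly condensed form and runs it on a small tree. The tree and its endpoint values are hypothetical, made up for illustration; the function still assumes that a folder is a dict with exactly one key.

```python
def sort_objects(objects):
    # Assumption: a folder is a dict with exactly one key (its name).
    folders = [o for o in objects if len(o) == 1]
    files = [o for o in objects if len(o) != 1]
    for folder in folders:
        name, inner_objects = next(iter(folder.items()))
        folder[name] = sort_objects(inner_objects)  # recurse into the folder
    folders.sort(key=lambda e: next(iter(e)))
    files.sort(key=lambda e: e['file_name'])
    return folders + files

# Hypothetical nested data with a folder inside a folder.
tree = [
    {'file_name': 'b.txt', 'endpoint': '/demo/b.txt'},
    {'src': [
        {'file_name': 'z.py', 'endpoint': '/demo/src/z.py'},
        {'lib': [{'file_name': 'a.py', 'endpoint': '/demo/src/lib/a.py'}]},
        {'file_name': 'm.py', 'endpoint': '/demo/src/m.py'},
    ]},
    {'docs': []},
]

result = sort_objects(tree)
top = [e.get('file_name') or next(iter(e)) for e in result]
inner = [e.get('file_name') or next(iter(e)) for e in result[1]['src']]
print(top)    # ['docs', 'src', 'b.txt']
print(inner)  # ['lib', 'm.py', 'z.py']
```

Note that the function replaces each folder's list inside the existing dicts, so the original nested structure is modified as well as returned.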