Create nested dictionary from directory

Time:01-01

I have a folder with files named like this:

Max_1.wav
Max_2.wav
Max_3.wav
Toto_1.wav
Toto_2.wav
Toto_3.wav
Valtteri_1.wav
Valtteri_2.wav
Valtteri_3.wav

I want to achieve this result:

dict = {
    'Max': {
        'Max_1': {'path': 'database/Max_1.wav'},
        'Max_2': {'path': 'database/Max_2.wav'},
        'Max_3': {'path': 'database/Max_3.wav'}
    },

    'Toto': {
        'Toto_1': {'path': 'database/Toto_1.wav'},
        'Toto_2': {'path': 'database/Toto_2.wav'},
        'Toto_3': {'path': 'database/Toto_3.wav'}
    },

    'Valtteri': {
        'Valtteri_1': {'path': 'database/Valtteri_1.wav'},
        'Valtteri_2': {'path': 'database/Valtteri_2.wav'},
        'Valtteri_3': {'path': 'database/Valtteri_3.wav'}
    }

}

This is the code I've been working on:

dict = {}

for x in os.listdir(directory):
    dict[x[:-6]] = {}

for x in os.listdir(directory):
    for key in dict:
        if x[:-6] == key:
            dict[key] = {x[:-4]: {'path': f'database/{x}'}}

This gives me the result below. I think the inner dictionaries are being overwritten on each pass of the for loop, but I can't seem to wrap my head around a solution.

dict = {
    'Max': {
        'Max_3': {'path': 'database/Max_3.wav'}
    },

    'Toto': {
        'Toto_3': {'path': 'database/Toto_3.wav'}
    },

    'Valtteri': {
        'Valtteri_3': {'path': 'database/Valtteri_3.wav'}
    }

}

Help would be appreciated, thank you.

CodePudding user response:

@tozCSS's advice solves your problem. As an alternative that avoids the double loop and may be useful in the future, you can use the dict.setdefault method:

import os

out = {}
for x in os.listdir(directory):
    # 'Max_1.wav' -> outer key 'Max', inner key 'Max_1'
    out.setdefault(x.split('_')[0], {}).setdefault(x.split('.')[0], {}).update({'path': f'database/{x}'})

Output:

{'Max': {'Max_1': {'path': 'database/Max_1.wav'},
  'Max_2': {'path': 'database/Max_2.wav'},
  'Max_3': {'path': 'database/Max_3.wav'}},
 'Toto': {'Toto_1': {'path': 'database/Toto_1.wav'},
  'Toto_2': {'path': 'database/Toto_2.wav'},
  'Toto_3': {'path': 'database/Toto_3.wav'}},
 'Valtteri': {'Valtteri_1': {'path': 'database/Valtteri_1.wav'},
  'Valtteri_2': {'path': 'database/Valtteri_2.wav'},
  'Valtteri_3': {'path': 'database/Valtteri_3.wav'}}}
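To make the one-liner above easier to follow, here is a minimal sketch of what dict.setdefault does on its own. The file names are just the ones from the question, and the variable names inner and same_inner are made up for illustration:

```python
out = {}

# First call: 'Max' is missing, so the empty dict is stored under 'Max' and returned.
inner = out.setdefault('Max', {})
inner['Max_1'] = {'path': 'database/Max_1.wav'}

# Second call: 'Max' already exists, so the stored dict is returned unchanged
# and the new empty dict passed as the default is discarded.
same_inner = out.setdefault('Max', {})
same_inner['Max_2'] = {'path': 'database/Max_2.wav'}

print(inner is same_inner)
print(out)
```

Because the existing inner dictionary is returned rather than replaced, entries accumulate instead of being overwritten.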

CodePudding user response:

Instead of using the assignment operator here, which replaces the whole inner dictionary on every iteration:

dict[key] = {x[:-4]: {'path': f'database/{x}'}}

call the update method on the dictionary:

dict[key].update({x[:-4]: {'path': f'database/{x}'}})
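For reference, a sketch of how the question's code might look with that change applied. It assumes a hard-coded file_names list standing in for os.listdir(directory), and it renames the variable to nested (a made-up name) so the built-in dict is not shadowed:

```python
# Hypothetical stand-in for os.listdir(directory)
file_names = ['Max_1.wav', 'Max_2.wav', 'Toto_1.wav']

nested = {}

# First pass: create an empty inner dict per name prefix.
for x in file_names:
    nested[x[:-6]] = {}

# Second pass: x[:-6] is already the outer key, so the inner
# "for key in dict" loop from the question is not needed.
for x in file_names:
    nested[x[:-6]].update({x[:-4]: {'path': f'database/{x}'}})

print(nested)
```

Note that the [:-6] and [:-4] slices only work while every name ends in a single digit followed by a four-character extension such as .wav.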

CodePudding user response:

You can use defaultdict from the collections module to skip updating, and skip listing your directory twice:

import pprint
from collections import defaultdict

# Standing in for your file tree
file_names = [
    'Max_1.wav',
    'Max_2.wav',
    'Max_3.wav',
    'Toto_1.wav',
    'Toto_2.wav',
    'Toto_3.wav',
    'Valtteri_1.wav',
    'Valtteri_2.wav',
    'Valtteri_3.wav',
]

results = defaultdict(dict)

for fname in file_names:
    top_key = fname[:-6]
    file_key = fname[:-4]
    results[top_key][file_key] = {'path': f'database/{fname}'}

pprint.pprint(dict(results))

Output:

{'Max': {'Max_1': {'path': 'database/Max_1.wav'},
         'Max_2': {'path': 'database/Max_2.wav'},
         'Max_3': {'path': 'database/Max_3.wav'}},
 'Toto': {'Toto_1': {'path': 'database/Toto_1.wav'},
          'Toto_2': {'path': 'database/Toto_2.wav'},
          'Toto_3': {'path': 'database/Toto_3.wav'}},
 'Valtteri': {'Valtteri_1': {'path': 'database/Valtteri_1.wav'},
              'Valtteri_2': {'path': 'database/Valtteri_2.wav'},
              'Valtteri_3': {'path': 'database/Valtteri_3.wav'}}}
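If the file numbers can go past 9 (for example Max_10.wav), the fixed-width slices above would cut the names in the wrong place. A possible variation, under that assumption, derives the keys with os.path.splitext and str.rpartition instead. The function name build_index and its prefix parameter are made up for this sketch, and the temporary directory only exists to make the example self-contained:

```python
import os
import tempfile
from collections import defaultdict


def build_index(directory, prefix='database'):
    """Group file names in `directory` by the part before the last underscore."""
    results = defaultdict(dict)
    for fname in os.listdir(directory):
        stem, _ext = os.path.splitext(fname)        # 'Max_10.wav' -> 'Max_10'
        top_key, sep, _num = stem.rpartition('_')   # 'Max_10' -> 'Max'
        if not sep:
            continue  # assumption: names without an underscore are skipped
        results[top_key][stem] = {'path': f'{prefix}/{fname}'}
    return dict(results)


# Demo on a throwaway directory with hypothetical file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ['Max_1.wav', 'Max_10.wav', 'Toto_2.wav']:
        open(os.path.join(tmp, name), 'w').close()
    index = build_index(tmp)

print(index)
```

Whether to skip or keep names without an underscore depends on your data; skipping is just the choice made here.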