Code: https://pastebin.com/GJn74YX5
I have a nested dictionary with the data structure:
foo[count] [string variable] = value
foo = { 0 : { "a": 2, "b": 5, "c": 6},
1 : { "a": 3, "b": 8, "d": 9},
2 : { "b": 5, "d": 9, "c": 3}}
I want to collect the values that share a key across these dicts into one list per key, perhaps in a new dict. If a key is not present for a specific count, a 0 should be added instead, so that all lists in the new dict have the same length. It should look like this:
{ "a": [2, 3, 0], "b": [5, 8, 5], "c":[6, 0, 3], "d": [0,9,9] }
My approach:
Create newdict as a defaultdict(list), so I can append to the list for a key whether or not the key already exists, as seen above. This will be the final dict I want:
newdict = defaultdict(list)
Create the following for loop to append to the new dict:
for count in foo.keys():
    for variable in foo[count].keys():
        if variable in foo[count].keys():
            newdict[variable].append(foo[count].get(variable))
        elif variable not in foo[count].keys():
            newdict[variable].append(0)
        else:
            newdict[variable] = foo[count].get(variable)
My problem:
Output:
{ "a": [2, 3], "b": [5, 8, 5], "c":[6, 3], "d": [9,9] }
- The newdict seems to merge all the values, but it always goes into the first if statement
- The elif block is never reached: 0 is never appended to the list
- The else block is also never reached, but the values seem to be appended correctly, so that might not be a big deal(?)
I spent hours on this and can't seem to wrap my head around why 0 isn't being appended. Any help is appreciated. Thank you in advance!
CodePudding user response:
The problem is in these lines of code:
for count in foo.keys():
    for variable in foo[count].keys():
The inner loop only goes through the keys that are present in foo[count], so the check variable in foo[count].keys() is always true and the elif branch is never reached. For example, "d" first appears in the dictionary for the second key, so it never gets a value in the first position. The problem can be solved by first generating a list of all the keys that appear in any of the inner dictionaries. The following code works, although it can be optimized.
from collections import defaultdict
foo = { 0 : { "a": 2, "b": 5, "c": 6},
        1 : { "a": 3, "b": 8, "d": 9},
        2 : { "b": 5, "d": 9, "c": 3} }
# first pass: collect every key that appears in any inner dict
key_list = []
for count in foo.keys():
    for variable in foo[count].keys():
        if variable not in key_list:
            key_list.append(variable)
# second pass: append the value if the key is present, otherwise 0
newdict = defaultdict(list)
for count in foo.keys():
    for variable in key_list:
        if variable in foo[count].keys():
            newdict[variable].append(foo[count].get(variable))
        else:
            newdict[variable].append(0)
newdict
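The two passes above can be tightened. What follows is only a minimal sketch, assuming the same foo as in the question; the names all_keys and merged are chosen here for illustration. It collects the keys in first-seen order and then uses dict.get with a default of 0, so no if/else is needed:

```python
foo = {0: {"a": 2, "b": 5, "c": 6},
       1: {"a": 3, "b": 8, "d": 9},
       2: {"b": 5, "d": 9, "c": 3}}

# Collect every key across the inner dicts, preserving first-seen order
all_keys = list(dict.fromkeys(k for inner in foo.values() for k in inner))

# For each key, read it from every inner dict, defaulting to 0 when absent
merged = {k: [inner.get(k, 0) for inner in foo.values()] for k in all_keys}

print(merged)
```

Because dict.get returns the default when the key is missing, the "key not present" case is handled without a separate branch.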
CodePudding user response:
Here is a possible alternative approach:
from collections import defaultdict
# directly prepare the correct list size
newdict = defaultdict(lambda: [0]*len(foo))
for count in foo:
    for v in foo[count]:
        newdict[v][count] = foo[count][v]
Output:
>>> dict(newdict)
{'a': [2, 3, 0],
'b': [5, 8, 5],
'c': [6, 0, 3],
'd': [0, 9, 9]}
If the dictionary keys are not 0, 1, 2, ..., you can use for i, count in enumerate(foo) and i will be the index into the sublists.
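The enumerate variant could look roughly like this. It is a sketch only: the outer keys "run1", "run2", "run3" are made up for illustration, and it relies on dicts keeping insertion order (Python 3.7+), so that i matches the position of each inner dict:

```python
from collections import defaultdict

# Hypothetical input whose outer keys are not 0, 1, 2, ...
foo = {"run1": {"a": 2, "b": 5, "c": 6},
       "run2": {"a": 3, "b": 8, "d": 9},
       "run3": {"b": 5, "d": 9, "c": 3}}

# Each new key starts as a list of zeros with one slot per inner dict
newdict = defaultdict(lambda: [0] * len(foo))

for i, count in enumerate(foo):
    for v, value in foo[count].items():
        # i is the position of this inner dict, independent of its key
        newdict[v][i] = value

result = dict(newdict)
print(result)
```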
CodePudding user response:
Hope this may also help:
import numpy as np
all_keys = []
newdict = {}
# identify all unique keys (extend keeps the list flat, even if the inner dicts differ in size)
for count in foo.keys():
    all_keys.extend(foo[count].keys())
# np.unique returns the sorted unique keys; tolist() turns them back into plain strings
all_keys = np.unique(all_keys).tolist()
# prepare final dictionary
for i in all_keys:
    newdict[i] = []
# collect data
for count in foo.keys():
    # keys that are missing from this inner dict
    check_key = set(all_keys).difference(foo[count].keys())
    for variable in foo[count].keys():
        newdict[variable].append(foo[count][variable])
    for key in check_key:
        newdict[key].append(0)
newdict