Average of values with same key in a nested dictionary in python

Time:11-10

Hi, I am learning about nested dictionaries in Python. I am trying to calculate the average of the values that share the same key. This is my dictionary:

avg = {'AKL': [{'Fhr': '19:30', 'Dhr': '39:25', 'Thr': '141:00'}, 
               {'Fhr': '58:10', 'Dhr': '130:35', 'Thr': '414:25'}, 
               {'Fhr': '7:30', 'Dhr': '18:25', 'Thr': '30:40'}, 
               {'Fhr': '7:00', 'Dhr': '14:15', 'Thr': '26:30'}], 
       'CHC': [{'Fhr': '33:10', 'Dhr': '62:20', 'Thr': '157:20'}, 
               {'Fhr': '51:55', 'Dhr': '101:40', 'Thr': '263:55'}]}

I have tried to calculate the averages like this:

result_dict = dict([(k, sum(average_dict[k].items())) for k, v in average_dict.items()])

This is the error I'm currently getting:

AttributeError: 'list' object has no attribute 'items'

I want a result like this:

{'AKL': {'Avg Fhr': 28.23, 'Avg Dhr': 187.85, 'Avg Thr': 195.21},
 'CHC': {'Avg Fhr': 42.32, 'Avg Dhr': 81.8, 'Avg Thr': 210.37}
}

CodePudding user response:

You can use nested loops and collections.defaultdict:

from collections import defaultdict

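def time2float(s):
    # convert an 'HH:MM' string to decimal hours
    a, b = s.split(':')
    return int(a) + int(b)/60

out = {}
for k, l in avg.items():
    d = defaultdict(lambda: 0)
    for e in l:
        for k2, v2 in e.items():
            # running total of hours for each inner key
            d[k2] += time2float(v2)
    # divide each total by the number of entries to get the mean
    out[k] = {k2: round(v/len(l), 2) for k2, v in d.items()}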

output:

{'AKL': {'Fhr': 23.04, 'Dhr': 50.67, 'Thr': 153.15},
 'CHC': {'Fhr': 42.54, 'Dhr': 82.0, 'Thr': 210.62}}

input:

avg = {'AKL': [{'Fhr': '19:30', 'Dhr': '39:25', 'Thr': '141:00'}, 
               {'Fhr': '58:10', 'Dhr': '130:35', 'Thr': '414:25'}, 
               {'Fhr': '7:30', 'Dhr': '18:25', 'Thr': '30:40'}, 
               {'Fhr': '7:00', 'Dhr': '14:15', 'Thr': '26:30'}], 
       'CHC': [{'Fhr': '33:10', 'Dhr': '62:20', 'Thr': '157:20'}, 
               {'Fhr': '51:55', 'Dhr': '101:40', 'Thr': '263:55'}]}

output as time strings:

from collections import defaultdict

def time2float(s):
    # convert an 'HH:MM' string to decimal hours
    a, b = s.split(':')
    return int(a) + int(b)/60

def float2time(i):
    # convert decimal hours back to an 'H:MM' string
    a, b = divmod(i, 1)
    return f'{int(a)}:{round(b*60):02}'

out = {}
for k, l in avg.items():
    d = defaultdict(lambda: 0)
    for e in l:
        for k2, v2 in e.items():
            # running total of hours for each inner key
            d[k2] += time2float(v2)
    out[k] = {k2: float2time(v/len(l)) for k2, v in d.items()}

output:

{'AKL': {'Fhr': '23:02', 'Dhr': '50:40', 'Thr': '153:09'},
 'CHC': {'Fhr': '42:32', 'Dhr': '82:00', 'Thr': '210:38'}}
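
As a possible variant of the time-string version above, the averaging can be done in whole minutes, so that rounding happens once on an integer-like quantity. With the float-based float2time, a value such as 1.9999 hours would be formatted as '1:60', because the minutes are rounded separately from the hours. This is only a sketch: the helper names to_minutes, to_time and mean_time are made up for illustration, and the sample data is the 'CHC' part of the question's input.

```python
def to_minutes(s):
    # 'HH:MM' -> total minutes as an int (hypothetical helper)
    h, m = s.split(':')
    return int(h) * 60 + int(m)

def to_time(minutes):
    # total minutes (int) -> 'H:MM' string (hypothetical helper)
    h, m = divmod(minutes, 60)
    return f'{h}:{m:02}'

def mean_time(strings):
    # average a list of 'HH:MM' strings, rounded to the nearest minute
    total = sum(to_minutes(s) for s in strings)
    return to_time(round(total / len(strings)))

# subset of the question's data, used only as sample input
avg = {'CHC': [{'Fhr': '33:10', 'Dhr': '62:20', 'Thr': '157:20'},
               {'Fhr': '51:55', 'Dhr': '101:40', 'Thr': '263:55'}]}

# assumes every entry has the same keys as the first one
out = {k: {name: mean_time([e[name] for e in entries]) for name in entries[0]}
       for k, entries in avg.items()}
print(out)
```

Because the rounded value is a whole number of minutes, divmod by 60 can never leave 60 in the minutes field.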

CodePudding user response:

Take a look at the sum(avg[k].items()) part: avg[k] is a list of dicts, not a dict, so it has no .items(). It will need to be replaced with a comprehension that calls list_elem.items() for each list_elem in that list. Can you figure out why and how?
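
One way this hint could be followed through is sketched below. It is not necessarily what the answerer had in mind: it iterates over the list inside each outer key and converts each 'HH:MM' string to decimal hours before summing. The helper to_hours is a made-up name, the 'Avg ' prefix follows the output format requested in the question, and the sample data is the 'CHC' part of the question's input.

```python
def to_hours(s):
    # 'HH:MM' -> decimal hours (hypothetical helper)
    h, m = s.split(':')
    return int(h) + int(m) / 60

# subset of the question's data, used only as sample input
avg = {'CHC': [{'Fhr': '33:10', 'Dhr': '62:20', 'Thr': '157:20'},
               {'Fhr': '51:55', 'Dhr': '101:40', 'Thr': '263:55'}]}

# avg[k] is a list, so loop over its elements instead of calling .items() on it;
# assumes every entry has the same keys as the first one
result_dict = {
    k: {'Avg ' + name: round(sum(to_hours(e[name]) for e in entries) / len(entries), 2)
        for name in entries[0]}
    for k, entries in avg.items()
}
print(result_dict)
```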

CodePudding user response:

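You can first build a dict that collects all values for each key into lists, then compute the mean of each list, as below. Note that the values shown here are floats: each 'HH:MM' string from the question has been read as HH.MM (for example '19:30' as 19.3), which is not the same as decimal hours.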

>>> dct = {}
>>> for k,v in avg.items():
...    tmp_dct ={} 
...    for i in v:
...        for j, w in i.items():
...            tmp_dct.setdefault(j,[]).append(w)
...    dct[k] = tmp_dct
    
>>> dct
{'AKL': {'Dhr': [39.25, 130.35, 18.25, 14.15],
         'Fhr': [19.3, 58.1, 7.3, 7.0],
         'Thr': [141.0, 414.25, 30.4, 26.3]},
 'CHC': {'Dhr': [62.2, 101.4], 
         'Fhr': [33.1, 51.55], 
         'Thr': [157.2, 263.55]}}

>>> {k: {i:sum(j)/len(j) for i,j in v.items()} for k,v in dct.items() }
{'AKL': {'Fhr': 22.925, 'Dhr': 50.5, 'Thr': 152.98749999999998},
 'CHC': {'Fhr': 42.325, 'Dhr': 81.80000000000001, 'Thr': 210.375}}

CodePudding user response:

Assuming all sub-dicts have the same keys in the same order, you can use statistics.mean and a dict comprehension:

from statistics import mean

avg = {'AKL': [{'Fhr': 19.30, 'Dhr': 39.25, 'Thr': 141.00}, {'Fhr': 58.10, 'Dhr': 130.35, 'Thr': 414.25}, {'Fhr': 7.30, 'Dhr': 18.25, 'Thr': 30.40}, {'Fhr': 7.00, 'Dhr': 14.15, 'Thr': 26.30}], 
       'CHC': [{'Fhr': 33.10, 'Dhr': 62.20, 'Thr': 157.20}, {'Fhr': 51.55, 'Dhr': 101.40, 'Thr': 263.55}]}

result_dict = {key: dict(zip(vals[0], map(mean, zip(*map(dict.values, vals)))))
               for key, vals in avg.items()}

Result:

{'AKL': {'Fhr': 22.925, 'Dhr': 50.5, 'Thr': 152.9875}, 'CHC': {'Fhr': 42.325, 'Dhr': 81.80000000000001, 'Thr': 210.375}}
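
If the key order of the sub-dicts cannot be relied on, a variant that looks values up by key instead of by position may be safer. This is a minimal sketch that still assumes every sub-dict contains all keys of the first one; the sample input reuses the float values from this answer, with the second sub-dict deliberately reordered.

```python
from statistics import mean

# second sub-dict reordered on purpose to show that lookup is by key
avg = {'CHC': [{'Fhr': 33.10, 'Dhr': 62.20, 'Thr': 157.20},
               {'Thr': 263.55, 'Fhr': 51.55, 'Dhr': 101.40}]}

# for each key of the first sub-dict, average that key across all sub-dicts
result_dict = {key: {name: mean(d[name] for d in vals) for name in vals[0]}
               for key, vals in avg.items()}
print(result_dict)
```

With the positional zip(*map(dict.values, vals)) approach, the reordered sub-dict above would pair 'Fhr' with 'Thr' values, since dict.values follows insertion order.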