Finding average value in list of dictionaries based on another unique value

Time:05-24

I have a list of dictionaries that have an "index" and a "weight" value. I want to average the dictionaries based on any unique index. So, with the below example, how can I find the average weight for any given index (e.g. 0, 1, 250, etc.)? There will be 8 total elements for each index.

values = [
{'index': 0, 'weight': 0.5},
{'index': 1, 'weight': 0.5},
{'index': 0, 'weight': 0.5},
{'index': 1, 'weight': 0.5},
{'index': 0, 'weight': 0.0},
{'index': 1, 'weight': 1.0},
{'index': 0, 'weight': 0.0},
{'index': 1, 'weight': 1.0},
{'index': 0, 'weight': 0.0},
{'index': 1, 'weight': 1.0},
{'index': 0, 'weight': 1.0},
{'index': 1, 'weight': 0.0},
{'index': 0, 'weight': 1.0},
{'index': 1, 'weight': 0.0},
{'index': 0, 'weight': 1.0},
{'index': 1, 'weight': 0.0}
]

I know I can get the average weight for the whole list using the following code, but I'm not sure how to do this per unique index:

print(sum(v['weight'] for v in values ) / len(values))

CodePudding user response:

You need to group weights by index. defaultdict from the built-in collections module is useful here.

from collections import defaultdict
total = defaultdict(float)
cnts = defaultdict(int)
for d in values:
    # add weights
    total[d['index']] += d['weight']
    # count indexes
    cnts[d['index']] += 1
# find the mean
[{'index': k, 'mean weight': total[k]/cnts[k]} for k in total]
# [{'index': 0, 'mean weight': 0.5}, {'index': 1, 'mean weight': 0.5}]
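Since the question asks for the average of any given index, the same grouping idea can be wrapped in a function that returns a plain index-to-mean mapping. This is only a minimal sketch: the name mean_weight_by_index and the short sample list are made up for illustration and are not part of the answer above.

```python
from collections import defaultdict

def mean_weight_by_index(records):
    """Return {index: mean weight} using a single pass over the records."""
    totals = defaultdict(float)  # running sum of weights per index
    counts = defaultdict(int)    # number of records seen per index
    for rec in records:
        totals[rec['index']] += rec['weight']
        counts[rec['index']] += 1
    return {idx: totals[idx] / counts[idx] for idx in totals}

# Hypothetical sample data, shorter than the list in the question.
sample = [
    {'index': 0, 'weight': 0.5},
    {'index': 0, 'weight': 0.0},
    {'index': 1, 'weight': 1.0},
    {'index': 1, 'weight': 0.5},
]

means = mean_weight_by_index(sample)
print(means[0])  # 0.25
print(means[1])  # 0.75
```

Building the mapping once makes each later lookup a dictionary access instead of another scan of the list.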

CodePudding user response:

I would recommend using pandas for this task. Simply create your dataframe by passing your list of dictionary objects to the DataFrame() constructor and then perform a groupby() and mean() calculation:

import pandas as pd
avgs = pd.DataFrame(values).groupby('index').mean()

Yields:

       weight
index
0         0.5
1         0.5

CodePudding user response:

Using only Python

def compute_avg(l, index):
    count = 0
    value = 0
    for data in l:
        if data['index'] == index:
            count += 1
            value += data['weight']

    return value/count
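One thing to watch with a per-index function like this: if the requested index never appears in the list, the count stays at 0 and the division raises ZeroDivisionError. A possible guard, sketched under the assumption that returning None is acceptable for a missing index (the name compute_avg_or_none and the sample data are hypothetical):

```python
def compute_avg_or_none(records, index):
    """Mean weight for one index, or None if the index is absent."""
    weights = [r['weight'] for r in records if r['index'] == index]
    if not weights:
        # No record carries this index, so there is nothing to average.
        return None
    return sum(weights) / len(weights)

# Hypothetical sample data.
sample = [
    {'index': 0, 'weight': 0.5},
    {'index': 0, 'weight': 0.0},
    {'index': 1, 'weight': 1.0},
]

print(compute_avg_or_none(sample, 0))    # 0.25
print(compute_avg_or_none(sample, 250))  # None
```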

CodePudding user response:

You can get all the values with your given index like this:

with_index = [v for v in values if v['index'] == given_index]

Then call this to show the average weight.

print(sum(v['weight'] for v in with_index) / len(with_index))

CodePudding user response:

Loop through the values and track in real time:

x = {}
for v in values:
    try:
        x[v['index']]['weight'] += v['weight']
    except KeyError:
        x[v['index']] = {'weight' : v['weight']}
    try:
        x[v['index']]['count'] += 1
    except KeyError:
        x[v['index']].update({'count':1})

    #or wait until after the loop to calculate
    #allows for continuation in a streaming situation. 

    avg = x[v['index']]['weight'] / x[v['index']]['count']
    x[v['index']].update({'avg': avg})
    
print(x)
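For the streaming case mentioned in the comments, an alternative is to keep only a count and a running mean per index and update the mean incrementally, rather than storing the sum. This is a different technique from the loop above, shown here as a rough sketch; update_running_mean, state, and stream are illustrative names.

```python
def update_running_mean(state, record):
    """Fold one record into state, a dict of index -> (count, mean)."""
    count, mean = state.get(record['index'], (0, 0.0))
    count += 1
    # Incremental mean: shift the old mean toward the new value by 1/count.
    mean += (record['weight'] - mean) / count
    state[record['index']] = (count, mean)
    return mean

# Hypothetical stream of records arriving one at a time.
stream = [
    {'index': 0, 'weight': 1.0},
    {'index': 0, 'weight': 0.0},
    {'index': 1, 'weight': 0.5},
]

state = {}
for rec in stream:
    update_running_mean(state, rec)

print(state)  # {0: (2, 0.5), 1: (1, 0.5)}
```

The current average for every index is available after each record, with no second pass over the data.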

CodePudding user response:

Use mean from the standard-library statistics module

We can use statistics.mean to solve the problem:

from statistics import mean

average_weight = {
    index: mean(v['weight'] for v in values if v['index'] == index) 
    for index in set(v['index'] for v in values)
}

On test values

values = [
    {'index': 0, 'weight': 0.5},
    {'index': 1, 'weight': 0.5},
    {'index': 0, 'weight': 0.5},
    {'index': 1, 'weight': 0.5},
    {'index': 0, 'weight': 0.0},
    {'index': 1, 'weight': 1.0},
    {'index': 0, 'weight': 0.0},
    {'index': 1, 'weight': 1.0},
    {'index': 0, 'weight': 0.0},
    {'index': 1, 'weight': 1.0},
    {'index': 0, 'weight': 1.0},
    {'index': 1, 'weight': 0.0},
    {'index': 0, 'weight': 1.0},
    {'index': 1, 'weight': 0.0},
    {'index': 0, 'weight': 1.0},
    {'index': 1, 'weight': 0.0}
]

the average_weight is

{0: 0.5, 1: 0.5}

Use known info

If you know that there will be 8 total elements for each index, why not use that?

COUNT = 8

average_weight = {
    index: sum(v['weight'] for v in values if v['index'] == index) / COUNT
    for index in set(v['index'] for v in values)
}
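Dividing by a fixed count gives a wrong average without any error if some index has more or fewer elements than expected. A small check along these lines could confirm the assumption first; indexes_with_wrong_count is a hypothetical helper, and the sample uses an expected count of 2 only to keep the data short.

```python
from collections import Counter

def indexes_with_wrong_count(records, expected):
    """Return {index: actual count} for indexes that do not have `expected` records."""
    counts = Counter(r['index'] for r in records)
    return {idx: n for idx, n in counts.items() if n != expected}

# Hypothetical sample: index 0 appears twice, index 1 only once.
sample = [
    {'index': 0, 'weight': 0.5},
    {'index': 0, 'weight': 0.0},
    {'index': 1, 'weight': 1.0},
]

print(indexes_with_wrong_count(sample, expected=2))  # {1: 1}
```

An empty result means the fixed divisor is safe to use for this data.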

CodePudding user response:

indexes = set([v['index'] for v in values])
for i in indexes:
  print(sum(v['weight'] for v in values if v['index'] == i ) / sum([v['index'] == i for v in values]))

This is a variation of your code. It relies on Python treating booleans as integers (True is 1, False is 0), so summing the comparisons counts the dictionaries with each index.
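The counting step on its own can be sketched as follows; count_matching and the sample list are illustrative names, not part of the answer.

```python
def count_matching(records, index):
    # Each comparison yields a bool; sum() adds True as 1 and False as 0.
    return sum(r['index'] == index for r in records)

# Hypothetical sample data.
sample = [
    {'index': 0, 'weight': 0.5},
    {'index': 0, 'weight': 0.0},
    {'index': 1, 'weight': 1.0},
]

print(count_matching(sample, 0))    # 2
print(count_matching(sample, 250))  # 0
```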
