I have a list of dictionaries, each with an "index" and a "weight" value. I want to average the weights for each unique index. So, with the example below, how can I find the average weight for any given index (e.g. 0, 1, 250, etc.)? There will be 8 total elements for each index.
values = [
{'index': 0, 'weight': 0.5},
{'index': 1, 'weight': 0.5},
{'index': 0, 'weight': 0.5},
{'index': 1, 'weight': 0.5},
{'index': 0, 'weight': 0.0},
{'index': 1, 'weight': 1.0},
{'index': 0, 'weight': 0.0},
{'index': 1, 'weight': 1.0},
{'index': 0, 'weight': 0.0},
{'index': 1, 'weight': 1.0},
{'index': 0, 'weight': 1.0},
{'index': 1, 'weight': 0.0},
{'index': 0, 'weight': 1.0},
{'index': 1, 'weight': 0.0},
{'index': 0, 'weight': 1.0},
{'index': 1, 'weight': 0.0}
]
I know I can get the average weight for the whole list using the following code, but I'm not sure how to do this per unique index:
print(sum(v['weight'] for v in values) / len(values))
CodePudding user response:
You need to group weights by index. defaultdict from the built-in collections module is useful here.
from collections import defaultdict
total = defaultdict(int)
cnts = defaultdict(int)
for d in values:
    # add weights (accumulate, do not overwrite)
    total[d['index']] += d['weight']
    # count indexes
    cnts[d['index']] += 1
# find the mean
[{'index': k, 'mean weight': total[k] / cnts[k]} for k in total]
# [{'index': 0, 'mean weight': 0.5}, {'index': 1, 'mean weight': 0.5}]
CodePudding user response:
I would recommend using pandas for this task. Simply create your dataframe by passing your list of dictionaries to the DataFrame() constructor, then perform a groupby() and mean() calculation:
import pandas as pd
avgs = pd.DataFrame(values).groupby('index').mean()
Yields:
       weight
index
0         0.5
1         0.5
CodePudding user response:
Using only Python:
def compute_avg(l, index):
    count = 0
    value = 0
    for data in l:
        if data['index'] == index:
            # accumulate the count and the weight for matching entries
            count += 1
            value += data['weight']
    return value / count
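The function above raises ZeroDivisionError when no dictionary has the requested index (for example 250 in the sample data). A minimal sketch of one way to guard against that follows; the name average_for_index, the small sample list, and the choice to return None are assumptions made for illustration, not part of the answer.

```python
def average_for_index(records, index):
    """Return the mean weight for `index`, or None if it never occurs."""
    # Collect the weights that belong to the requested index.
    weights = [r['weight'] for r in records if r['index'] == index]
    if not weights:
        # Hypothetical choice: signal "no data" instead of dividing by zero.
        return None
    return sum(weights) / len(weights)


# Small made-up sample, not the data from the question.
sample = [
    {'index': 0, 'weight': 0.5},
    {'index': 1, 'weight': 1.0},
    {'index': 0, 'weight': 1.0},
]

print(average_for_index(sample, 0))    # 0.75
print(average_for_index(sample, 250))  # None
```

Raising a KeyError or ValueError instead of returning None would be equally reasonable if a missing index should be treated as a bug.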
CodePudding user response:
You can get all the values with your given index like this:
with_index = [v for v in values if v['index'] == given_index]
Then call this to show the average weight.
print(sum(v['weight'] for v in with_index) / len(with_index))
CodePudding user response:
Loop through the values and track in real time:
x = {}
for v in values:
    try:
        # accumulate the weight for an index we have already seen
        x[v['index']]['weight'] += v['weight']
    except KeyError:
        x[v['index']] = {'weight': v['weight']}
    try:
        x[v['index']]['count'] += 1
    except KeyError:
        x[v['index']].update({'count': 1})
    # or wait until after the loop to calculate
    # allows for continuation in a streaming situation.
    avg = x[v['index']]['weight'] / x[v['index']]['count']
    x[v['index']].update({'avg': avg})
print(x)
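For the streaming case mentioned in the comments, the running average can also be updated directly, without keeping a separate sum. The sketch below uses the standard incremental-mean update, new_mean = old_mean + (x - old_mean) / n. The class name RunningMeans and the sample stream are hypothetical and only serve as illustration.

```python
class RunningMeans:
    """Keeps a count and a running mean per index (hypothetical helper)."""

    def __init__(self):
        self.count = {}
        self.mean = {}

    def add(self, record):
        idx = record['index']
        n = self.count.get(idx, 0) + 1
        prev = self.mean.get(idx, 0.0)
        self.count[idx] = n
        # Incremental update: new_mean = old_mean + (x - old_mean) / n
        self.mean[idx] = prev + (record['weight'] - prev) / n
        return self.mean[idx]


# Made-up stream of records arriving one at a time.
stream = [
    {'index': 0, 'weight': 0.5},
    {'index': 1, 'weight': 1.0},
    {'index': 0, 'weight': 1.0},
    {'index': 0, 'weight': 0.0},
]

tracker = RunningMeans()
for record in stream:
    tracker.add(record)

print(tracker.mean)  # {0: 0.5, 1: 1.0}
```

The current mean for every index is available after each record, so processing can stop and resume at any point.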
CodePudding user response:
Use mean from the pure-Python statistics module
We can use statistics.mean to solve the problem:
from statistics import mean
average_weight = {
    index: mean(v['weight'] for v in values if v['index'] == index)
    for index in set(v['index'] for v in values)
}
On test values
values = [
{'index': 0, 'weight': 0.5},
{'index': 1, 'weight': 0.5},
{'index': 0, 'weight': 0.5},
{'index': 1, 'weight': 0.5},
{'index': 0, 'weight': 0.0},
{'index': 1, 'weight': 1.0},
{'index': 0, 'weight': 0.0},
{'index': 1, 'weight': 1.0},
{'index': 0, 'weight': 0.0},
{'index': 1, 'weight': 1.0},
{'index': 0, 'weight': 1.0},
{'index': 1, 'weight': 0.0},
{'index': 0, 'weight': 1.0},
{'index': 1, 'weight': 0.0},
{'index': 0, 'weight': 1.0},
{'index': 1, 'weight': 0.0}
]
average_weight is:
{0: 0.5, 1: 0.5}
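The comprehension above rescans values once per distinct index, which is fine for small inputs but grows with the number of indexes. A possible single-pass variant groups the weights first and then applies statistics.mean to each group. The function name mean_weight_by_index and the sample list are assumptions for this sketch.

```python
from collections import defaultdict
from statistics import mean


def mean_weight_by_index(records):
    """Group weights by index in one pass, then average each group."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r['index']].append(r['weight'])
    return {index: mean(weights) for index, weights in grouped.items()}


# Small made-up sample, not the data from the question.
sample = [
    {'index': 0, 'weight': 0.5},
    {'index': 1, 'weight': 1.0},
    {'index': 0, 'weight': 1.0},
]

print(mean_weight_by_index(sample))  # {0: 0.75, 1: 1.0}
```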
Use known info
If you know that there will be 8 total elements for each index, why not use that?
COUNT = 8
average_weight = {
    index: sum(v['weight'] for v in values if v['index'] == index) / COUNT
    for index in set(v['index'] for v in values)
}
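Dividing by a fixed count silently gives wrong averages if some index has more or fewer elements than expected. A small sanity check with collections.Counter can confirm the assumption before relying on it. The helper name check_group_sizes and the sample data are hypothetical.

```python
from collections import Counter


def check_group_sizes(records, expected):
    """Return {index: actual_count} for every index whose count differs from `expected`."""
    counts = Counter(r['index'] for r in records)
    return {index: n for index, n in counts.items() if n != expected}


# Made-up sample where index 1 has only one element.
sample = [
    {'index': 0, 'weight': 0.5},
    {'index': 0, 'weight': 1.0},
    {'index': 1, 'weight': 1.0},
]

print(check_group_sizes(sample, expected=2))  # {1: 1}
```

An empty result means every index has exactly the expected number of elements.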
CodePudding user response:
indexes = set(v['index'] for v in values)
for i in indexes:
    print(sum(v['weight'] for v in values if v['index'] == i) / sum(v['index'] == i for v in values))
This is a variation of your code. It counts the dictionaries with each index by summing booleans: each v['index'] == i comparison is True or False, and sum() treats True as 1 and False as 0.
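The counting step works because bool is a subclass of int in Python. A short illustration with made-up data:

```python
# bool is a subclass of int, so True adds 1 and False adds 0.
flags = [True, False, True, True]
print(sum(flags))  # 3

# Small made-up sample, not the data from the question.
sample = [
    {'index': 0, 'weight': 0.5},
    {'index': 1, 'weight': 1.0},
    {'index': 0, 'weight': 1.0},
]

# Count how many dictionaries have index 0.
matches = sum(r['index'] == 0 for r in sample)
print(matches)  # 2
```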