In Python 3, I have a dictionary of the form {episode: count}, and I can't figure out how to group the keys by a substring (the season prefix) and sum the values in each group.
input:
dict = {'S01E01': 27, 'S01E02': 27, 'S01E03': 32, 'S01E04': 36, 'S01E05': 35, 'S01E06': 31,
'S02E01': 33, 'S02E02': 21, 'S02E03': 20, 'S02E04': 29, 'S02E05': 33, 'S02E06': 42}
Wanted output:
output_dict = {'S01': 188 , 'S02' : 178}
I've tried building an intermediate list of seasons and using reduce and Counter, with no success.
List = ['S01', 'S02']
I also searched here but couldn't find anything, probably because I used the wrong terminology. Any help would be appreciated. Thanks
CodePudding user response:
The answer by Onyambu is probably the more Pythonic way to solve this problem, but if you're looking for a more human-readable solution that fits this specific use case, you can do something like this:
episodes = {'S01E01': 27, 'S01E02': 27, 'S01E03': 32, 'S01E04': 36, 'S01E05': 35, 'S01E06': 31,
'S02E01': 33, 'S02E02': 21, 'S02E03': 20, 'S02E04': 29, 'S02E05': 33, 'S02E06': 42}
output = {}
for episode in episodes:
    season = episode[0:3]  # Gets the first 3 characters, e.g. 'S01'
    if season not in output:
        output[season] = episodes[episode]  # first episode seen for this season
    else:
        output[season] += episodes[episode]  # add to the running total
print(output)
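The if/else in the loop above can be avoided with collections.defaultdict, which starts every missing key at 0. This is only a minimal sketch: the function name sum_by_season and the prefix_len parameter are made up for illustration, and it assumes the season is always the first 3 characters of the key.

```python
from collections import defaultdict

episodes = {'S01E01': 27, 'S01E02': 27, 'S01E03': 32, 'S01E04': 36, 'S01E05': 35, 'S01E06': 31,
            'S02E01': 33, 'S02E02': 21, 'S02E03': 20, 'S02E04': 29, 'S02E05': 33, 'S02E06': 42}


def sum_by_season(counts, prefix_len=3):
    """Sum the values of `counts`, grouped by the first `prefix_len` characters of each key."""
    totals = defaultdict(int)  # missing seasons start at 0
    for episode, count in counts.items():
        totals[episode[:prefix_len]] += count
    return dict(totals)  # convert back to a plain dict


print(sum_by_season(episodes))  # {'S01': 188, 'S02': 178}
```

Unlike the groupby approach, this does not depend on the order of the keys.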
CodePudding user response:
Use a dict comprehension (with your data saved as d):
from itertools import groupby
{key: sum(list(zip(*val))[1]) for key, val in groupby(d.items(), key=lambda x: x[0][:3])}
Out: {'S01': 188, 'S02': 178}
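One caveat worth spelling out: itertools.groupby only merges adjacent items, so the comprehension above relies on the dictionary already being ordered by season. If that is not guaranteed, sorting by the same key first should fix it. A small sketch with a deliberately shuffled, made-up dictionary (the names shuffled and season are only for illustration):

```python
from itertools import groupby

# Hypothetical data where the seasons are interleaved
shuffled = {'S02E01': 33, 'S01E01': 27, 'S02E02': 21, 'S01E02': 27}


def season(item):
    """Return the season prefix of a (key, value) pair."""
    return item[0][:3]


# Without sorting, each season is split into several groups and
# later groups overwrite earlier ones in the resulting dict.
unsorted_result = {key: sum(v for _, v in group)
                   for key, group in groupby(shuffled.items(), key=season)}

# Sorting by the grouping key first makes each season one contiguous group.
sorted_result = {key: sum(v for _, v in group)
                 for key, group in groupby(sorted(shuffled.items(), key=season), key=season)}

print(unsorted_result)  # {'S02': 21, 'S01': 27}
print(sorted_result)    # {'S01': 54, 'S02': 54}
```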
Using a normal for loop: first save your data as d, then delete the name dict (i.e. del dict), since it shadows the built-in dict function. Now you can run the following code:
result = dict()
for key, val in d.items():
    var1 = key[:3]  # season prefix, e.g. 'S01'
    if not result.get(var1):
        result[var1] = 0
    result[var1] += val  # accumulate instead of overwriting
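Since the question mentions trying Counter: collections.Counter can also do the accumulation, because a missing key counts as 0. A minimal sketch, again assuming the data is saved as d and the season is the first 3 characters of each key:

```python
from collections import Counter

d = {'S01E01': 27, 'S01E02': 27, 'S01E03': 32, 'S01E04': 36, 'S01E05': 35, 'S01E06': 31,
     'S02E01': 33, 'S02E02': 21, 'S02E03': 20, 'S02E04': 29, 'S02E05': 33, 'S02E06': 42}

totals = Counter()
for key, val in d.items():
    totals[key[:3]] += val  # missing keys default to 0 in a Counter

print(dict(totals))           # {'S01': 188, 'S02': 178}
print(totals.most_common(1))  # [('S01', 188)]
```

A Counter also gives you most_common() for free if you later want the seasons ranked by total.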
CodePudding user response:
I am assuming the subkey is only 3 characters long.
dic = {'S01E01': 27, 'S01E02': 27, 'S01E03': 32, 'S01E04': 36, 'S01E05': 35, 'S01E06': 31,
'S02E01': 33, 'S02E02': 21, 'S02E03': 20, 'S02E04': 29, 'S02E05': 33, 'S02E06': 42}
First extract unique subkeys:
subkeys = set([key[:3] for key in dic.keys()])
Afterwards, use a dictionary comprehension to sum up values for each subkey.
out = {subkey: sum([value for key, value in dic.items() if subkey in key]) for subkey in subkeys}
Uglier one-liner:
out = {subkey[:3]: sum([value for key, value in dic.items() if subkey[:3] in key]) for subkey in dic.keys()}
CodePudding user response:
Another approach:
data = {'S01E01': 27, 'S01E02': 27, 'S01E03': 32, 'S01E04': 36, 'S01E05': 35, 'S01E06': 31,
'S02E01': 33, 'S02E02': 21, 'S02E03': 20, 'S02E04': 29, 'S02E05': 33, 'S02E06': 42}
from itertools import groupby
out = {}
for key, value in groupby(data, lambda x: x[:3]):
    out[key] = sum(data[val] for val in value)
print(out)
Output:
{'S01': 188, 'S02': 178}