Count frequency of values in a dictionary when the values are lists

I have a dictionary, created from two columns in a dataframe using:

df_dict=df.groupby('column1')['column2'].agg(list).to_dict()

I then used Counter to count how many items were put into each key as values:

df_dict_counts = Counter(df_dict)

This gives me per key, how many items are present as the key's values, which is great.

But now, I want to count the frequency of the items per key and to print the count.

So my dictionary looks like this:

df_dict = {
 'Apples': ['big', 'medium', 'medium', 'medium', 'big', 'small'],
 'Oranges': ['big', 'medium', 'big'],
 'Bananas': ['small', 'small', 'small', 'small', 'big'],
 'Pineapples': ['small', 'big', 'big', 'big']
}

and the output I am aiming for is something like this:

df_dict_counts = {
 'Apples': {'big': 2, 'medium': 3, 'small': 1},
 'Oranges': {'big': 2, 'medium': 1},
 'Bananas': {'small': 4, 'big': 1},
 'Pineapples': {'small': 1, 'big': 3}
}

If you could also help me write 'df_dict_counts' to a .csv file, that would be great!

Thanks!!

CodePudding user response：

You can use Counter to convert each list into a frequency map:

import collections

final = {}
for k, v in df_dict.items():
    # Counter tallies how often each item occurs in the list v
    final[k] = dict(collections.Counter(v))
print(final)

CodePudding user response：

You picked the right tool in Counter. But instead of applying Counter to df_dict itself, you need to apply it to each key's list of values. Try this:

import collections

df_dict = {
 'Apples': ['big', 'medium', 'medium', 'medium', 'big', 'small'],
 'Oranges': ['big', 'medium', 'big'],
 'Bananas': ['small', 'small', 'small', 'small', 'big'],
 'Pineapples': ['small', 'big', 'big', 'big']
}

count_dict = {}
for key,val in df_dict.items():
    count_dict[key] = dict(collections.Counter(val))
print(count_dict)
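
To illustrate the difference, here is a minimal sketch on a shortened copy of the data (the names wrapped, sizes and freqs are only for illustration). When Counter is given a dictionary, it takes the existing values as the "counts", so the lists are stored unchanged rather than tallied:

```python
from collections import Counter

# Shortened sample data, used here only for illustration
df_dict = {
    'Oranges': ['big', 'medium', 'big'],
    'Pineapples': ['small', 'big', 'big', 'big'],
}

# Counter(mapping) keeps the mapping's values as they are,
# so each "count" is still the original list
wrapped = Counter(df_dict)
print(wrapped['Oranges'])

# Number of items stored under each key
sizes = {key: len(values) for key, values in df_dict.items()}
print(sizes)

# Frequency of each distinct item under each key
freqs = {key: dict(Counter(values)) for key, values in df_dict.items()}
print(freqs)
```

If only the number of items per key is needed, len() on each list is enough; Counter is needed only for the per-item frequencies.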

CodePudding user response：

You can do this with Counter and a dict comprehension. dict.items() gives you the keys (Apples, Oranges, ...) and the values (lists) whose items you need to count.

from collections import Counter
df_new = {k: dict(Counter(v)) for k, v in df_dict.items()}

Result

{'Apples': {'big': 2, 'medium': 3, 'small': 1},
 'Oranges': {'big': 2, 'medium': 1},
 'Bananas': {'small': 4, 'big': 1},
 'Pineapples': {'small': 1, 'big': 3}}

To save this result to a file (note that this writes the dictionary as JSON text, even though the file is named .csv):

import json
with open('file.csv', 'w') as convert_file:
     convert_file.write(json.dumps(df_new))
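
If a file with real comma-separated rows is wanted, one possible sketch uses the standard csv module and writes one row per (key, item) pair. The helper names, the header names key, item and count, and the file name counts.csv are all assumptions made for this example, not something given in the question:

```python
import csv
import io

# Counts in the shape shown in the result above
df_new = {
    'Apples': {'big': 2, 'medium': 3, 'small': 1},
    'Oranges': {'big': 2, 'medium': 1},
    'Bananas': {'small': 4, 'big': 1},
    'Pineapples': {'small': 1, 'big': 3},
}

def counts_to_rows(counts):
    """Flatten {key: {item: count}} into (key, item, count) tuples."""
    return [(key, item, n)
            for key, inner in counts.items()
            for item, n in inner.items()]

def write_counts_csv(counts, stream):
    """Write a header line plus one row per (key, item) pair."""
    writer = csv.writer(stream)
    writer.writerow(['key', 'item', 'count'])  # header names are an assumption
    writer.writerows(counts_to_rows(counts))

# An in-memory buffer keeps the example self-contained; for a real file use
# open('counts.csv', 'w', newline='') instead (hypothetical file name).
buffer = io.StringIO()
write_counts_csv(df_new, buffer)
print(buffer.getvalue())
```

This long layout (one row per key and item) is a design choice: it copes with keys that have different sets of items, which a one-column-per-item layout would have to pad with empty cells.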