Count frequency of values in dictionary when values is a list

Time:09-02

I have a dictionary, created from two columns in a dataframe using:

df_dict=df.groupby('column1')['column2'].agg(list).to_dict()

I then used Counter to count how many items were put into each key as values:

df_dict_counts = Counter(df_dict)

This gives me per key, how many items are present as the key's values, which is great.

But now I want to count how often each item occurs within each key, and print those counts.

So my dictionary looks like this:

df_dict = {
 'Apples': ['big', 'medium', 'medium', 'medium', 'big', 'small'],
 'Oranges': ['big', 'medium', 'big'],
 'Bananas': ['small', 'small', 'small', 'small', 'big'],
 'Pineapples': ['small', 'big', 'big', 'big']
}

and the output I am aiming for is something like this:

df_dict_counts = {
 'Apples': {'big':2, 'medium':3, 'small':1},
 'Oranges': {'big':2, 'medium':1},
 'Bananas': {'small':4, 'big':1},
 'Pineapples': {'small':1, 'big':3}
}

If you could also help me write 'df_dict_counts' to a .csv file, that would be great!

Thanks!!

CodePudding user response:

You can use Counter to convert each list into a frequency map:

import collections
    
final = {}
for k,v in df_dict.items():
    final[k] = dict(collections.Counter(v))
print(final)
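
As a small illustration of what Counter does to a single list, here is a minimal sketch. The names `sizes` and `counts` are made up for this example and do not appear in the question:

```python
from collections import Counter

# Sample list copied from the 'Oranges' entry in the question.
sizes = ['big', 'medium', 'big']

# Counter tallies how often each distinct item occurs in the list.
counts = Counter(sizes)

# dict() turns the Counter into a plain dictionary.
print(dict(counts))            # {'big': 2, 'medium': 1}

# most_common(n) returns the n most frequent (item, count) pairs.
print(counts.most_common(1))   # [('big', 2)]
```

The loop above applies the same idea to every list in df_dict.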

CodePudding user response:

You picked the right tool in Counter. But instead of applying it to df_dict itself, you need to apply it to the value stored under each key in df_dict. Try this:

import collections

df_dict = {
 'Apples': ['big', 'medium', 'medium', 'medium', 'big', 'small'],
 'Oranges': ['big', 'medium', 'big'],
 'Bananas': ['small', 'small', 'small', 'small', 'big'],
 'Pineapples': ['small', 'big', 'big', 'big']
}

count_dict = {}
for key,val in df_dict.items():
    count_dict[key] = dict(collections.Counter(val))
print(count_dict)
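
For comparison, here is a rough sketch of what happens when the whole dictionary is passed to Counter, and of one way to get the number of items per key. It assumes a shortened version of the question's data, and `as_counter` and `totals` are names introduced only for this example:

```python
from collections import Counter

# Shortened sample data, same shape as df_dict in the question.
df_dict = {
    'Oranges': ['big', 'medium', 'big'],
    'Pineapples': ['small', 'big', 'big', 'big'],
}

# Counter(df_dict) copies each list in as the value; nothing is tallied.
as_counter = Counter(df_dict)
print(as_counter['Oranges'])   # ['big', 'medium', 'big']

# Number of items stored under each key, using len() on each list.
totals = {key: len(val) for key, val in df_dict.items()}
print(totals)                  # {'Oranges': 3, 'Pineapples': 4}
```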

CodePudding user response:

You can do this with Counter and a dict comprehension. dict.items() gives you the keys (Apples, Oranges, ...) and the values (the lists) whose items you need to count.

from collections import Counter
df_new = {k: dict(Counter(v)) for k, v in df_dict.items()}

Result

{'Apples': {'big': 2, 'medium': 3, 'small': 1},
 'Oranges': {'big': 2, 'medium': 1},
 'Bananas': {'small': 4, 'big': 1},
 'Pineapples': {'small': 1, 'big': 3}}

To save this result to a file (this writes the dictionary as JSON text, even though the file is named .csv):

import json
with open('file.csv', 'w') as convert_file:
     convert_file.write(json.dumps(df_new))
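
If actual comma-separated rows are wanted instead, one possible layout is one row per (key, item, count). This is only a sketch using the standard csv module. The file name `counts.csv`, the header names, and the shortened sample data are assumptions made for the example:

```python
import csv
from collections import Counter

# Shortened sample data, same shape as df_dict in the question.
df_dict = {
    'Oranges': ['big', 'medium', 'big'],
    'Pineapples': ['small', 'big', 'big', 'big'],
}
df_new = {k: dict(Counter(v)) for k, v in df_dict.items()}

# 'counts.csv' and the header names are placeholders chosen for this sketch.
# newline='' is what the csv module expects when opening files for writing.
with open('counts.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['key', 'item', 'count'])
    for key, counts in df_new.items():
        for item, count in counts.items():
            writer.writerow([key, item, count])
```

This "long" layout avoids having to decide on a fixed set of columns when different keys contain different items.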