pandas value count of elements in list in column

Time:04-16

I have a column that contains lists of varying length, drawn from a small set of possible items.

print(df['channels'].value_counts(), '\n')

Output:

[web, email, mobile, social]    77733
[web, email, mobile]            43730
[email, mobile, social]         32367
[web, email]                    13751

So I want the total number of times that web, email, mobile and social each occur.

These should be:

web    = 77733 + 43730 + 13751         = 135,214
email  = 77733 + 43730 + 13751 + 32367 = 167,581
mobile = 77733 + 43730 + 32367         = 153,830
social = 77733 + 32367                 = 110,100

I have tried the following two methods:

sum_channels_items = pd.Series([x for item in df['channels'] for x in item]).value_counts()
print(sum_channels_items)

from itertools import chain
test = pd.Series(list(chain.from_iterable(df['channels']))).value_counts()
print(test)

Both fail with the same error (only the traceback for the second one is shown).

Traceback (most recent call last):
  File "C:/Users/Mark/PycharmProjects/main/main.py", line 416, in <module>
    test = pd.Series(list(chain.from_iterable(df['channels']))).value_counts()
TypeError: 'float' object is not iterable

CodePudding user response:

One option is to explode the lists so that each element gets its own row, then count the values:

out = df['channels'].explode().value_counts()
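
To illustrate the explode approach, here is a minimal sketch on made-up data. The sample DataFrame (including the NaN row standing in for a missing value) is an assumption, not the asker's real data. It also shows that a missing row does not break this approach, because value_counts() ignores NaN by default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN row stands in for a missing value.
df = pd.DataFrame({
    "channels": [
        ["web", "email", "mobile", "social"],
        ["web", "email", "mobile"],
        ["email", "mobile", "social"],
        ["web", "email"],
        np.nan,
    ]
})

# explode() gives each list element its own row; a NaN row stays NaN.
exploded = df["channels"].explode()

# value_counts() drops NaN by default, so only channel names are counted.
out = exploded.value_counts()
print(out)
```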

Another option is collections.Counter. Your error suggests the column contains missing values (NaN is a float, hence "'float' object is not iterable"), so drop them first:

import pandas as pd
from itertools import chain
from collections import Counter
# dropna() removes the missing rows before the lists are flattened and counted
out = pd.Series(Counter(chain.from_iterable(df['channels'].dropna())))
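
As a rough check of the Counter approach, the sketch below uses a small made-up DataFrame (an assumption, not the asker's data) with one missing row. It first shows that the missing entry is a float, which is the likely cause of the TypeError, and then counts the items after dropping it. Unlike value_counts(), a Series built from a Counter is not sorted by count, so the sketch sorts it explicitly.

```python
from collections import Counter
from itertools import chain

import numpy as np
import pandas as pd

# Hypothetical sample data with one missing row.
df = pd.DataFrame({"channels": [["web", "email"], ["email", "mobile"], np.nan]})

# The missing entry is a float, which is why iterating over it fails.
missing = df["channels"].iloc[2]
print(type(missing))

# Drop missing rows, flatten the remaining lists, and count each item.
counts = Counter(chain.from_iterable(df["channels"].dropna()))

# Sort by count so the result is ordered like value_counts() output.
out = pd.Series(counts).sort_values(ascending=False)
print(out)
```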