value_counts problem in a column containing lists (Python)

Time:11-11

value_counts can't count the individual values inside a list.

Hi, one of the columns in my DataFrame contains lists.

X Y
101 ['A']
200 ['A','O']
32 ['B']
41 ['A','AB','O']
202 ['A']

When I use value_counts(), I get this result:

['A'] 2
['A' , 'O'] 1
['B'] 1
['A','AB','O'] 1

But I want this result:

['A'] 4
['O'] 2
['B'] 1
['AB'] 1

Is there any code for it?

CodePudding user response:

I think explode will work for you. It gives each list item its own row, so value_counts can then count the individual items:

import pandas as pd
df = pd.DataFrame({"a": [['A'], ['A', 'O'], ['B'], ['A', 'AB', 'O'], ['A']]})
df["a"].explode().value_counts()

# output
A     4
O     2
B     1
AB    1

# If your column holds strings that look like lists, parse them with literal_eval first
from ast import literal_eval
df = pd.DataFrame({"a": ["['A']", "['A', 'O']", "['B']", "['A', 'AB', 'O']", "['A']"]})
df["a"].apply(literal_eval).explode().value_counts()
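The two cases above could be folded into one helper. This is only a sketch: the function name count_list_items is made up for illustration, and it assumes every cell is either a real list or a string that literal_eval can parse.

```python
from ast import literal_eval

import pandas as pd


def count_list_items(column: pd.Series) -> pd.Series:
    """Count individual items in a column whose cells are lists or list-like strings.

    Hypothetical helper, not part of pandas.
    """
    # Parse cells stored as text such as "['A', 'O']"; leave real lists untouched.
    parsed = column.apply(
        lambda cell: literal_eval(cell) if isinstance(cell, str) else cell
    )
    # explode() gives each list item its own row; value_counts() then tallies them.
    return parsed.explode().value_counts()


# Sample data taken from the question.
df = pd.DataFrame({
    "X": [101, 200, 32, 41, 202],
    "Y": [["A"], ["A", "O"], ["B"], ["A", "AB", "O"], ["A"]],
})
counts = count_list_items(df["Y"])
print(counts)
```

Because the helper checks each cell individually, it should also cope with a column that mixes real lists and string representations.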

CodePudding user response:

You could unpack the items of all the lists (using itertools.chain), count them (using collections.Counter), and put the result back into a pandas object:

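import pandas as pd
from itertools import chain
from collections import Counter
# Only needed if the lists are stored as strings; eval runs arbitrary code, so ast.literal_eval is the safer choice for untrusted data
df['Y'] = df['Y'].apply(lambda x: eval(x))
# Flatten all the lists into one stream of items, count them, and wrap the counts in a Series
pd.Series(Counter(chain.from_iterable(df['Y'].values)))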

Output

A     4
O     2
B     1
AB    1
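A variant of the same idea, as a sketch: it swaps eval for ast.literal_eval (which only parses Python literals) and uses Counter.most_common() to order the result by count. It assumes the Y column holds strings, as in the snippet above; the names lists, tally and result are illustrative only.

```python
from ast import literal_eval
from collections import Counter
from itertools import chain

import pandas as pd

# Sample data from the question, stored as strings.
df = pd.DataFrame({"Y": ["['A']", "['A', 'O']", "['B']", "['A', 'AB', 'O']", "['A']"]})

# literal_eval only parses Python literals, so it cannot execute arbitrary code.
lists = df["Y"].apply(literal_eval)

# Flatten all lists into one stream of items and tally them.
tally = Counter(chain.from_iterable(lists))

# most_common() sorts by count, highest first; dict() keeps that order for the Series.
result = pd.Series(dict(tally.most_common()))
print(result)
```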