Pandas: How to get the count of each value in a column with groupby option-CodePudding

This is the dataframe I've got:

data = {'Year' : [2021, 2021, 2021, 2022, 2022, 2022], 
        'Class':['A', 'A', 'B', 'A', 'C', 'C'], 
        'Animal':['dog|cat|bird', 'cat|dog', 'tiger|dog', 'cat|bird', 'dog|cat|rabbit', 'rabbit|dog|tiger',]}
df = pd.DataFrame(data)

So the df looks like:

Year	Class	Animal
2021	A	dog\|cat\|bird
2021	A	cat\|dog
2021	B	tiger\|dog
2022	A	cat\|bird
2022	C	dog\|cat\|rabbit
2022	C	rabbit\|dog\|tiger

What I'd like to do is to calculate the number of each animal in each year and class. For example, I want to get the following dataframe:

Year	Class	Animal	Count
2021	A	dog	2
2021	A	cat	2
2021	A	bird	1
2021	B	tiger	1
2021	B	dog	1
2022	A	cat	1
2022	A	bird	1
2022	C	dog	2
2022	C	cat	1
2022	C	rabbit	2
2022	C	tiger	1

Does anyone have any suggestions about achieving this? I'd be really appreciate it.

CodePudding user response：

You can do this with a one-liner:

(df.assign(Animal=df['Animal'].str.split('|'))     # Create a list using split
  .explode('Animal')                               # Expand that list it rows using explode
  .value_counts(sort=False)                        # Use pd.DataFrame.value_counts 
  .rename('Count')                                 # Rename series
  .reset_index())                                  # Reset to a dataframe

Output:

    Year Class  Animal  Count
0   2021     A    bird      1
1   2021     A     cat      2
2   2021     A     dog      2
3   2021     B     dog      1
4   2021     B   tiger      1
5   2022     A    bird      1
6   2022     A     cat      1
7   2022     C     cat      1
8   2022     C     dog      2
9   2022     C  rabbit      2
10  2022     C   tiger      1

CodePudding user response：

Let us try str.get_dummies then groupby

out = (df.Animal.str.get_dummies('|')
       .groupby([df['Year'],df['Class']]).sum()
       .mask(lambda x : x==0)
       .rename_axis(['animal'],axis=1).stack().reset_index(name='Count')
Out[666]: 
    Year Class  animal  Count
0   2021     A    bird    1.0
1   2021     A     cat    2.0
2   2021     A     dog    2.0
3   2021     B     dog    1.0
4   2021     B   tiger    1.0
5   2022     A    bird    1.0
6   2022     A     cat    1.0
7   2022     C     cat    1.0
8   2022     C     dog    2.0
9   2022     C  rabbit    2.0
10  2022     C   tiger    1.0