Home > Enterprise >  How do I show the count of just two specific values in a column while using groupby
How do I show the count of just two specific values in a column while using groupby

Time:10-16

I have a data frame like this

df = pd.DataFrame({'Year':[1991,1992 ,1993,2000,2020,2022,1980],
           'Decade':["90s","90s","90s","2_00s","2_20s","2_20s","80s"]
           'Rating': [0,0,1,3,3,2,3]})

  |Year | Decade | Rating |
  |:----|:-------|:-------|
0 |1991 | 90s    | 0      |
1 |1992 | 90s    | 0      |
2 |1993 | 90s    | 1      |
3 |2000 | 2_00s  | 3      |
4 |2020 | 2_20s  | 3      |
5 |2022 | 2_20s  | 2      |
6 |1980 | 80s    | 3      |

now i want to group them by decades and know for each decade how many 3 it has each decade, and how many 0 it has (the count of each that values in the column rating. like this:

        Rating
Decade  0       3

80_s    0       1
90_s    2       0
2_00s   0       1
2_20s   0       1

CodePudding user response:

df.groupby(['Decade', 'Rating']).size().unstack()

Result

Rating    0    1    2    3
Decade                    
2_00s   NaN  NaN  NaN  1.0
2_20s   NaN  NaN  1.0  1.0
80s     NaN  NaN  NaN  1.0
90s     2.0  1.0  NaN  NaN

And if you don't want any NaN

df.groupby(['Decade', 'Rating']).size().unstack(fill_value=0)

CodePudding user response:

You could use pd.pivot_table

pd.pivot_table(
    data=df,
    values='Rating',
    index='Decade',
    columns='Rating',
    aggfunc='count',
    fill_value=0
)

Output

Rating  0  1  2  3
Decade            
2_00s   0  0  0  1
2_20s   0  0  1  1
80s     0  0  0  1
90s     2  1  0  0

CodePudding user response:

Try:

x = (
    df.groupby(df.Year // 10)
    .agg(
        decade=("Decade", "first"), ratings=("Rating", lambda x: x.eq(3).sum())
    )
    .reset_index(drop=True)
)
print(x)

Prints:

  decade  ratings
0    80s        1
1    90s        0
2  2_00s        1
3  2_20s        1
  • Related