I have a data frame like this
df = pd.DataFrame({'Year':[1991,1992 ,1993,2000,2020,2022,1980],
'Decade':["90s","90s","90s","2_00s","2_20s","2_20s","80s"]
'Rating': [0,0,1,3,3,2,3]})
|Year | Decade | Rating |
|:----|:-------|:-------|
0 |1991 | 90s | 0 |
1 |1992 | 90s | 0 |
2 |1993 | 90s | 1 |
3 |2000 | 2_00s | 3 |
4 |2020 | 2_20s | 3 |
5 |2022 | 2_20s | 2 |
6 |1980 | 80s | 3 |
now i want to group them by decades and know for each decade how many 3 it has each decade, and how many 0 it has (the count of each that values in the column rating. like this:
Rating
Decade 0 3
80_s 0 1
90_s 2 0
2_00s 0 1
2_20s 0 1
CodePudding user response:
df.groupby(['Decade', 'Rating']).size().unstack()
Result
Rating 0 1 2 3
Decade
2_00s NaN NaN NaN 1.0
2_20s NaN NaN 1.0 1.0
80s NaN NaN NaN 1.0
90s 2.0 1.0 NaN NaN
And if you don't want any NaN
df.groupby(['Decade', 'Rating']).size().unstack(fill_value=0)
CodePudding user response:
You could use pd.pivot_table
pd.pivot_table(
data=df,
values='Rating',
index='Decade',
columns='Rating',
aggfunc='count',
fill_value=0
)
Output
Rating 0 1 2 3
Decade
2_00s 0 0 0 1
2_20s 0 0 1 1
80s 0 0 0 1
90s 2 1 0 0
CodePudding user response:
Try:
x = (
df.groupby(df.Year // 10)
.agg(
decade=("Decade", "first"), ratings=("Rating", lambda x: x.eq(3).sum())
)
.reset_index(drop=True)
)
print(x)
Prints:
decade ratings
0 80s 1
1 90s 0
2 2_00s 1
3 2_20s 1