I am trying to calculate the ratio of group 1: group(>1) in the same year, my dataframe has the following format:
Group | Year | Count |
---|---|---|
1 | 2020 | 3 |
1 | 2021 | 4 |
2 | 2020 | 5 |
2 | 2021 | 6 |
3 | 2020 | 2 |
3 | 2021 | 3 |
The result should be like this:
Year | Ratio |
---|---|
2020 | 3/7 |
2021 | 4/9 |
I'm really stuck with this, can anyone help?
CodePudding user response:
You can divide DataFrame with 1
value in group
by aggregate all another groups aggregatet by sum
:
#if need equal
m = df['Group'].eq(1)
#if need greater
m = df['Group'].gt(1)
a = df[~m].set_index('Year')['Count'].div(df[m].groupby('Year')['Count'].sum())
print (a)
Year
2020 0.428571
2021 0.444444
Name: Count, dtype: float64
df1 = a.reset_index(name='Ratio')
print (df1)
Year Ratio
0 2020 0.428571
1 2021 0.444444
CodePudding user response:
You can use GroupBy.sum
and div
:
# get sum per Group Year
df2 = df.groupby(['Group', 'Year']).sum()
# divide 1st group by sum of others
out = (df2.loc[1].div(df2.drop(1).groupby(level='Year').sum())
#.reset_index() # optional, uncomment if you want Year as column
)
output:
Count
Year
2020 0.428571
2021 0.444444