I have a dataframe:
id life_day value
a1 1 10
a1 2 20
a1 3 10
a1 4 5
a1 5 5
a1 6 1
b2 1 7
b2 3 11
b2 4 10
b2 5 20
I want to sum the values for each id up to life_day 4. So the desired result is:
id life_day value
a1 4 45
b2 4 28
How can I do that? I tried df[df["life_day"] == 90].groupby("id").sum(), but it gives the wrong results.
CodePudding user response:
Your approach almost works, but I don't know why you wrote == 90 in df["life_day"] == 90, and it looks like you want the max of life_day, not the sum.
df[df['life_day'] <= 4].groupby('id').agg({'life_day': 'max', 'value': 'sum'})
life_day value
id
a1 4 45
b2 4 28
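For anyone who wants to reproduce this, here is a minimal self-contained sketch. The dataframe is rebuilt from the question; the `cutoff` variable and the named-aggregation form with `as_index=False` are my own choices for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["a1"] * 6 + ["b2"] * 4,
    "life_day": [1, 2, 3, 4, 5, 6, 1, 3, 4, 5],
    "value": [10, 20, 10, 5, 5, 1, 7, 11, 10, 20],
})

cutoff = 4  # hypothetical name for the life_day limit

# Keep rows up to the cutoff, then aggregate per id
result = (
    df[df["life_day"] <= cutoff]
    .groupby("id", as_index=False)
    .agg(life_day=("life_day", "max"), value=("value", "sum"))
)
print(result)
```

Using as_index=False keeps id as a regular column, which matches the layout of the desired result in the question.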
CodePudding user response:
Use pandas where to mask the rows, then groupby and agg:
df.where(df['life_day'].le(4)).groupby('id').agg({'life_day':'last','value':'sum'}).reset_index()
id life_day value
0 a1 4.0 45.0
1 b2 4.0 28.0
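Note that the output above is float because where replaces the masked rows with NaN. If integer columns are wanted, one possible follow-up is to cast back after aggregating. This is only a sketch: the dataframe is rebuilt from the question, and the astype step is my addition, not part of the answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["a1"] * 6 + ["b2"] * 4,
    "life_day": [1, 2, 3, 4, 5, 6, 1, 3, 4, 5],
    "value": [10, 20, 10, 5, 5, 1, 7, 11, 10, 20],
})

# where() turns rows with life_day > 4 into NaN, so numeric columns become float;
# groupby drops the NaN ids by default
masked = (
    df.where(df["life_day"].le(4))
    .groupby("id")
    .agg({"life_day": "last", "value": "sum"})
    .reset_index()
)

# Cast the aggregated columns back to integers (assumes no NaN remains after aggregation)
masked_int = masked.astype({"life_day": "int64", "value": "int64"})
print(masked_int)
```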