How to sum values in dataframe until certain values in other column by group?

Time:11-27

I have a dataframe:

id   life_day   value
a1     1         10
a1     2         20
a1     3         10
a1     4         5
a1     5         5
a1     6         1
b2     1         7
b2     3         11
b2     4         10
b2     5         20

I want to sum value for each id up to and including life_day 4. The desired result is:

id    life_day   value
a1       4         45
b2       4         28

How can I do that? I tried df[df["life_day"] == 90].groupby("id").sum(), but it gives the wrong result.

CodePudding user response:

Your approach almost works, but I don't know why you wrote == 90 in df["life_day"] == 90; to keep the rows up to life_day 4, the filter should be <= 4. It also looks like you want the max of life_day, not the sum.

df[df['life_day'] <= 4].groupby('id').agg({'life_day': 'max', 'value': 'sum'})
    life_day  value
id                 
a1         4     45
b2         4     28
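
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's dataframe and applies the same filter-then-aggregate idea. The variable name cutoff and the use of named aggregation with as_index=False are my own choices for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": ["a1"] * 6 + ["b2"] * 4,
    "life_day": [1, 2, 3, 4, 5, 6, 1, 3, 4, 5],
    "value": [10, 20, 10, 5, 5, 1, 7, 11, 10, 20],
})

cutoff = 4  # hypothetical name for the last life_day to include

# Keep only rows up to the cutoff, then aggregate per id:
# the largest life_day that survived the filter and the sum of value.
result = (
    df[df["life_day"] <= cutoff]
    .groupby("id", as_index=False)
    .agg(life_day=("life_day", "max"), value=("value", "sum"))
)

print(result)
```

Note that life_day here is the largest day actually present at or below the cutoff, so an id with no row for day 4 would report a smaller number.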

CodePudding user response:

Use pandas where with the condition to mask the rows, then groupby and agg:

df.where(df['life_day'].le(4)).groupby('id').agg({'life_day':'last','value':'sum'}).reset_index()

   id  life_day  value
0  a1       4.0   45.0
1  b2       4.0   28.0
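
The floats in that output come from where replacing the masked rows with NaN, which turns the integer columns into float columns. If integers are wanted, one option is to cast after aggregating. This is a sketch under the assumption that every remaining group has at least one unmasked row (otherwise the cast would fail on NaN); the name masked is only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": ["a1"] * 6 + ["b2"] * 4,
    "life_day": [1, 2, 3, 4, 5, 6, 1, 3, 4, 5],
    "value": [10, 20, 10, 5, 5, 1, 7, 11, 10, 20],
})

# Rows with life_day > 4 become all-NaN, including their id,
# so groupby drops them (NaN keys are excluded by default).
masked = df.where(df["life_day"].le(4))

result = (
    masked.groupby("id")
    .agg({"life_day": "last", "value": "sum"})
    .astype(int)      # undo the float upcast caused by the NaN mask
    .reset_index()
)

print(result)
```

Plain boolean indexing, df[df['life_day'] <= 4], avoids the upcast entirely because it drops the rows instead of masking them.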