Home > Back-end >  How to find the last line and the diff of each line
How to find the last line and the diff of each line

Time:07-27

I am trying to handle the following dataframe

df = pd.DataFrame({'ID':[1,1,2,2,3,3,3,4,4,4,4],
                   'sum':[1,2,1,2,1,2,3,1,2,3,4,]})

Now I want to find the difference from the last row by each ID.

Specifically, I tried this code.

df['diff'] = df.groupby('ID')['sum'].diff(-1)
df

enter image description here

However, this would require a difference from one line behind.

Is there any way to determine the difference between each of the last rows with groupbuy?

Thank you for your help.

CodePudding user response:

You can use transform('last') to get the last value per group:

df['diff'] = df['sum'].sub(df.groupby('ID')['sum'].transform('last'))

or using groupby.apply:

df['diff'] = df.groupby('ID')['sum'].apply(lambda x: x-x.iloc[-1])

output:

    ID  sum  diff
0    1    1    -1
1    1    2     0
2    2    1    -1
3    2    2     0
4    3    1    -2
5    3    2    -1
6    3    3     0
7    4    1    -3
8    4    2    -2
9    4    3    -1
10   4    4     0
  • Related