I am trying to handle the following dataframe
df = pd.DataFrame({'ID':[1,1,2,2,3,3,3,4,4,4,4],
'sum':[1,2,1,2,1,2,3,1,2,3,4,]})
Now I want to find the difference from the last row by each ID.
Specifically, I tried this code.
df['diff'] = df.groupby('ID')['sum'].diff(-1)
df
However, this would require a difference from one line behind.
Is there any way to determine the difference between each of the last rows with groupbuy?
Thank you for your help.
CodePudding user response:
You can use transform('last')
to get the last value per group:
df['diff'] = df['sum'].sub(df.groupby('ID')['sum'].transform('last'))
or using groupby.apply
:
df['diff'] = df.groupby('ID')['sum'].apply(lambda x: x-x.iloc[-1])
output:
ID sum diff
0 1 1 -1
1 1 2 0
2 2 1 -1
3 2 2 0
4 3 1 -2
5 3 2 -1
6 3 3 0
7 4 1 -3
8 4 2 -2
9 4 3 -1
10 4 4 0