I'm trying to calculate the difference between each row but I have an issue that I don't know how to solve. Below is my sample code:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Place': ['Hanoi','Hanoi','Hanoi','Hanoi','Hanoi','Hochiminh','Hochiminh','Hochiminh','Hochiminh','Hochiminh'],
'Date': ['2022-04-01','2022-04-02','2022-04-03','2022-04-04','2022-04-05','2022-04-01','2022-04-02','2022-04-03','2022-04-04','2022-04-05'],
'Number': [0,2,4,6,8,12,17,20,26,28]})
And it's the Output:
Place Date Number
0 Hanoi 2022-04-01 0
1 Hanoi 2022-04-02 2
2 Hanoi 2022-04-03 4
3 Hanoi 2022-04-04 6
4 Hanoi 2022-04-05 8
5 Hochiminh 2022-04-01 12
6 Hochiminh 2022-04-02 17
7 Hochiminh 2022-04-03 20
8 Hochiminh 2022-04-04 26
9 Hochiminh 2022-04-05 28
And then I used diff
to calculate different number:
df['diff'] = df['Number'].diff()
and it's Output:
Place Date Number diff
0 Hanoi 2022-04-01 0 NaN
1 Hanoi 2022-04-02 2 2.0
2 Hanoi 2022-04-03 4 2.0
3 Hanoi 2022-04-04 6 2.0
4 Hanoi 2022-04-05 8 2.0
5 Hochiminh 2022-04-01 12 4.0
6 Hochiminh 2022-04-02 17 5.0
7 Hochiminh 2022-04-03 20 3.0
8 Hochiminh 2022-04-04 26 6.0
9 Hochiminh 2022-04-05 28 2.0
As you see row 5
of column diff
is calculating by subtract row 5
of column number and row 4
of column number but it's not what I want.
I want row 5 will be reset to NaN
or 0
because I want to calculate different number for each place.
This is my expected Output
Place Date Number diff
0 Hanoi 2022-04-01 0 NaN
1 Hanoi 2022-04-02 2 2.0
2 Hanoi 2022-04-03 4 2.0
3 Hanoi 2022-04-04 6 2.0
4 Hanoi 2022-04-05 8 2.0
5 Hochiminh 2022-04-01 12 0.0
6 Hochiminh 2022-04-02 17 5.0
7 Hochiminh 2022-04-03 20 3.0
8 Hochiminh 2022-04-04 26 6.0
9 Hochiminh 2022-04-05 28 2.0
CodePudding user response:
df['diff'] = df.groupby('Place')['Number'].diff().fillna(0)
df
Place Date Number diff
0 Hanoi 2022-04-01 0 0.0
1 Hanoi 2022-04-02 2 2.0
2 Hanoi 2022-04-03 4 2.0
3 Hanoi 2022-04-04 6 2.0
4 Hanoi 2022-04-05 8 2.0
5 Hochiminh 2022-04-01 12 0.0
6 Hochiminh 2022-04-02 17 5.0
7 Hochiminh 2022-04-03 20 3.0
8 Hochiminh 2022-04-04 26 6.0
9 Hochiminh 2022-04-05 28 2.0