The idea would be to calculate the difference between the first and second rows, and store that value in the second row (similar to .diff()).
Then, skip the calculation between the second and third rows, and place 0 there instead.
Then, repeat this procedure throughout all rows in the DataFrame.
For example:
A
0 100
1 101
2 103
3 107
4 110
5 120
6 150
7 170
df['B'] = df['A'].diff()
A B
0 100 NaN
1 101 1
2 103 2
3 107 4
4 110 3
5 120 10
6 150 30
7 170 20
What I would like to achieve is:
A B
0 100 0
1 101 1
2 103 0
3 107 4
4 110 0
5 120 10
6 150 0
7 170 20
Any suggestions on how to accomplish this using Pandas (or Python)?
CodePudding user response:
You can just mask the result, replacing the diff with 0 on every row whose index is not odd:
df['B'] = df['A'].diff().mask(df.index%2!=1,0)
df
Out[469]:
A B
0 100 0.0
1 101 1.0
2 103 0.0
3 107 4.0
4 110 0.0
5 120 10.0
6 150 0.0
7 170 20.0
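Note that `df.index%2` relies on the default integer index (0, 1, 2, ...). As a minimal sketch, assuming the index might instead hold arbitrary labels, the same masking can be done on row positions. The helper name `pairwise_diff` and the letter index are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

def pairwise_diff(s: pd.Series) -> pd.Series:
    """Diff within consecutive row pairs; the first row of each pair gets 0."""
    pos = np.arange(len(s))  # row positions, independent of the index labels
    # Keep the diff on odd positions (second row of each pair), 0 elsewhere.
    return s.diff().where(pos % 2 == 1, 0)

# Same data as in the question, but with a non-numeric index (an assumption).
df = pd.DataFrame({"A": [100, 101, 103, 107, 110, 120, 150, 170]},
                  index=list("abcdefgh"))
df["B"] = pairwise_diff(df["A"])
print(df)
```

Here `where` keeps values where the condition is true, which is the mirror image of `mask`, so the condition is written as `pos % 2 == 1` rather than `!= 1`.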
Or group the rows into consecutive pairs with groupby and take the diff within each pair:
df['B'] = df.groupby(df.index//2).A.diff().fillna(0)
Out[472]:
0 0.0
1 1.0
2 0.0
3 4.0
4 0.0
5 10.0
6 0.0
7 20.0
Name: A, dtype: float64
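The groupby variant also depends on the default integer index, since `df.index//2` is what forms the pairs. A small sketch under the assumption that grouping by row position is acceptable, shown on a frame with an odd number of rows (the data and the name `pair_id` are made up for illustration):

```python
import numpy as np
import pandas as pd

# Five rows, so the last row has no partner in its pair.
df = pd.DataFrame({"A": [100, 101, 103, 107, 110]})

# Pair id per row position: 0, 0, 1, 1, 2
pair_id = np.arange(len(df)) // 2

# diff() restarts in each pair, leaving NaN on the first row of every pair;
# fillna(0) turns those NaN values into the requested 0.
df["B"] = df["A"].groupby(pair_id).diff().fillna(0)
print(df)
```

The unpaired last row simply ends up with 0, because it is the first (and only) row of its group.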