Pandas: how to sequentially alternate between calculating difference between two rows and skip calcu

Time:05-08

The idea would be to calculate the difference between the first and second rows, and store that value in the second row (similar to .diff()).

Then, skip the calculation between the second and third rows, and place 0.

Then, repeat this procedure throughout all rows in the Dataframe.

For example:

     A
0    100
1    101
2    103
3    107
4    110
5    120
6    150    
7    170

df['B'] = df['A'].diff()

     A       B
0    100     NaN
1    101     1
2    103     2
3    107     4
4    110     3
5    120     10
6    150     30
7    170     20

What I would like to achieve is:

     A       B
0    100     0
1    101     1
2    103     0
3    107     4
4    110     0
5    120     10
6    150     0
7    170     20

Any suggestions on how to accomplish this using Pandas (or Python)?

CodePudding user response:

You can mask the result of diff(), keeping the difference only on the odd-numbered rows and putting 0 everywhere else:

df['B'] = df['A'].diff().mask(df.index%2!=1,0)
df
Out[469]: 
     A     B
0  100   0.0
1  101   1.0
2  103   0.0
3  107   4.0
4  110   0.0
5  120  10.0
6  150   0.0
7  170  20.0
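
Note that df.index % 2 relies on the default 0, 1, 2, ... index. If the index were something else (an assumption, the question only shows the default index), a position-based mask along these lines might be a safer sketch. The name pos is made up for illustration:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"A": [100, 101, 103, 107, 110, 120, 150, 170]})

# Row positions 0, 1, 2, ... regardless of what the index labels are.
pos = np.arange(len(df))

# Keep the diff on odd positions (second row of each pair);
# replace it with 0 on even positions, which also covers the leading NaN.
df["B"] = df["A"].diff().mask(pos % 2 == 0, 0)

print(df)
```

The column comes out as float because diff() produces a NaN in the first row before the mask replaces it.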

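Or group each consecutive pair of rows with groupby and take the diff within each pair. The first row of every pair becomes NaN, which fillna(0) turns into 0 (assign the result to df['B'] as above):

df.groupby(df.index//2).A.diff().fillna(0)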
Out[472]: 
0     0.0
1     1.0
2     0.0
3     4.0
4     0.0
5    10.0
6     0.0
7    20.0
Name: A, dtype: float64
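
The groupby version can also be written against row positions instead of index values. The following is only a sketch: the letter index is invented here to show that the labels do not matter, and pair_id is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Same values as in the question, but with a made-up non-numeric index.
df = pd.DataFrame(
    {"A": [100, 101, 103, 107, 110, 120, 150, 170]},
    index=list("abcdefgh"),
)

# Rows 0-1 get id 0, rows 2-3 get id 1, and so on.
pair_id = np.arange(len(df)) // 2

# diff() restarts inside every pair, so the first row of each pair is NaN;
# fillna(0) turns those NaNs into the zeros asked for in the question.
df["B"] = df.groupby(pair_id)["A"].diff().fillna(0)

print(df)
```

With an odd number of rows, the last row forms a group of its own and ends up as 0 as well.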