Calculate difference between each pair of rows and jump each 2

Time:11-12

I need to calculate the difference within each consecutive pair of rows, but the usual solutions such as rolling and diff do not "jump": they slide forward one row at a time instead of moving two rows at a time.

To illustrate, I need an output column like this:

  a output
0 5 -3
1 8 -2
2 2 nan
3 4 nan

So (5-8) and (2-4) are my results.

I tried this, which doesn't "jump":

df['output'] = df['a'] - df['a'].shift(-1)
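
To make the problem concrete, here is a small sketch of what a shift-based attempt returns on the sample data. It assumes the column is named a, as in the table above; the variable name sliding is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [5, 8, 2, 4]})

# shift(-1) pairs every row with the row after it, so the window
# slides by one row instead of jumping by two
sliding = df["a"] - df["a"].shift(-1)

print(sliding.tolist())  # [-3.0, 6.0, -2.0, nan]
```

The wanted values (-3 and -2) are there, but at positions 0 and 2, with the unwanted 8 - 2 = 6 in between.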

CodePudding user response:

I would use numpy for that:

import numpy as np
N = 2
# reshape column-wise into N rows of 2, so np.diff gives a[i + N] - a[i]
df.loc[df.index[:N], 'output'] = np.diff(
    df['a'].to_numpy().reshape(N, -1, order='F')).ravel()

With pandas:

N = 2
df['output'] = df['a'].iloc[:N].rsub(df['a'].iloc[N:].values)

output (each value is a[i + N] - a[i], here 2 - 5 and 4 - 8):

   a  output
0  5    -3.0
1  8    -4.0
2  2     NaN
3  4     NaN

Other example with N=3:

   a  output
0  5    -3.0
1  8    -4.0
2  7    -6.0
3  2     NaN
4  4     NaN
5  1     NaN
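
For comparison, here is a minimal sketch of the pairwise reading stated in the question, where the results are (5-8) and (2-4) and are packed at the top of the column. The helper name pairwise_diff_packed is hypothetical, and the sketch assumes an even number of rows.

```python
import numpy as np
import pandas as pd

def pairwise_diff_packed(values):
    """Hypothetical helper: first minus second of each consecutive pair,
    written to the top of the result and padded with NaN."""
    arr = np.asarray(values, dtype=float)
    if arr.size % 2:
        raise ValueError("expected an even number of rows")
    pairs = arr.reshape(-1, 2)         # one row per consecutive pair
    diffs = pairs[:, 0] - pairs[:, 1]  # e.g. 5 - 8 and 2 - 4
    out = np.full(arr.size, np.nan)
    out[:diffs.size] = diffs
    return out

df = pd.DataFrame({"a": [5, 8, 2, 4]})
df["output"] = pairwise_diff_packed(df["a"])
# df["output"].tolist() -> [-3.0, -2.0, nan, nan]
```

Reshaping to (-1, 2) in the default row-major order is what makes the computation jump two rows at a time.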

CodePudding user response:

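Another way, using groupby:

# a[i] - a[i+1] inside each block of two rows (assumes a default 0..n-1 index)
df['output'] = df.groupby(df.index // 2)['a'].diff(-1)
# from row 1 onwards, move the results up by one row (no chained assignment)
df.loc[df.index[1:], 'output'] = df['output'].iloc[1:].shift(-1)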
df
'''
    a   output
0   5   -3.0
1   8   -2.0
2   2   nan
3   4   nan

'''

This is how it works with different data. I'm not sure exactly what you want. If this is wrong, please say so in a comment and I will delete it.

import pandas as pd
df = pd.DataFrame(data={'a': [5, 8, 2, 4, 10, 20, 10, 232, 323]})
df['output'] = df.groupby(df.index // 2)['a'].diff(-1)
df.loc[df.index[1:], 'output'] = df['output'].iloc[1:].shift(-1)
df
'''
    a   output
0   5   -3.0
1   8   -2.0
2   2   nan
3   4   -10.0
4   10  nan
5   20  -222.0
6   10  nan
7   232 nan
8   323 nan

'''
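
If the goal is instead to collect every within-pair difference at the top of the column, which is one possible reading of the question, a sketch along the same groupby lines might look like this. The dropna-and-pack step is an assumption about the desired layout, and it would also discard differences that are NaN because the input itself contains NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [5, 8, 2, 4, 10, 20, 10, 232, 323]})

# a[i] - a[i+1] inside each block of two rows; NaN on the second row
# of a pair and on a trailing unpaired row
within_pair = df.groupby(np.arange(len(df)) // 2)["a"].diff(-1)

# keep only the defined differences and pack them at the top
packed = within_pair.dropna().to_numpy()
df["output"] = np.nan
df.loc[df.index[:len(packed)], "output"] = packed
# first four values of df["output"]: -3.0, -2.0, -10.0, -222.0
```

Grouping on np.arange(len(df)) // 2 rather than df.index // 2 keeps the sketch independent of the index labels.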