Pandas: Calculate neighbouring differences from a column in dataframe

Time:02-26

How can I calculate the differences between neighbouring values in a dataframe column named 'y' using only Pandas commands?

Here is an example where I first convert the column 'y' to a NumPy array and then use np.diff.

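import numpy as np
import pandas as pd

np.random.seed(10)

df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=['x', 'y'])

# Convert column 'y' to a NumPy array and take the differences of consecutive elements
y = df['y'].values
diff_y = np.diff(y)

# Show each value (except the last) next to the difference to its successor
print(np.array([y[:-1], diff_y]).T)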

[[ 4 -3]
 [ 1 -1]
 [ 0  8]
 [ 8 -8]
 [ 0  6]
 [ 6 -3]
 [ 3  1]
 [ 4  4]
 [ 8  0]]

CodePudding user response:

You could use diff to find the differences and shift(-1) to align each difference with the first value of its pair (as in your output):

df['diff_y'] = df['y'].diff().shift(-1)
print(df[['y', 'diff_y']])

Output:

   y  diff_y
0  4    -3.0
1  1    -1.0
2  0     8.0
3  8    -8.0
4  0     6.0
5  6    -3.0
6  3     1.0
7  4     4.0
8  8     0.0
9  8     NaN
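
The diff_y column is float because the last row has no successor and is filled with NaN. If the result should match the np.diff output exactly (one row fewer, integer values), the trailing row can be dropped and the column cast back to int. The sketch below is one possible way to do that, not the only one. It hard-codes the 'y' values from the output above instead of regenerating them from the seed, and the names forward_a, forward_b and result are only illustrative.

```python
import numpy as np
import pandas as pd

# 'y' values copied from the output above
df = pd.DataFrame({'y': [4, 1, 0, 8, 0, 6, 3, 4, 8, 8]})

# Forward difference y[i+1] - y[i], written in two equivalent ways
forward_a = df['y'].diff().shift(-1)       # diff, then move up one row
forward_b = df['y'].shift(-1) - df['y']    # successor minus current value

# Drop the last row (NaN, no successor) and cast back to integers
result = df.assign(diff_y=forward_b).dropna().astype(int)
print(result)
```

Here dropna() removes the row without a successor, so result has nine rows, the same length as the np.diff output in the question.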