I am tackling an issue in pandas: I would like to group a DataFrame by an index column, then perform a transform(np.gradient) (i.e. compute the derivative over all values in a group). This doesn't work if a group is too small (fewer than 2 elements), so I would like to just return 0 in that case.
The following code raises an error:
import pandas as pd
import numpy as np

data = pd.DataFrame(
    {
        "time": [0, 0, 1, 2, 2, 3, 3],
        "position": [0.1, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2],
        "speed": [150.0, 145.0, 149.0, 150.0, 150.0, 150.0, 150.0],
    }
)
derivative = data.groupby("time").transform(np.gradient)
This gives me a ValueError:
ValueError: Shape of array too small to calculate a numerical gradient, at least (edge_order + 1) elements are required.
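The error can be reproduced without pandas, which suggests it comes from the single-row group (time == 1) rather than from groupby itself. A minimal sketch, where the variable names are only illustrative:

```python
import numpy as np

# Two points are enough for a first-order finite difference.
two_points = np.gradient(np.array([150.0, 145.0]))

# A single point is not: np.gradient raises ValueError for it.
try:
    np.gradient(np.array([149.0]))
    single_point_error = None
except ValueError as exc:
    single_point_error = str(exc)

print(two_points)
print(single_point_error)
```

With the default edge_order of 1, np.gradient needs at least 2 values, so any group with a single row has to be handled separately.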
The desired output for the example DataFrame above would be
time  position_km
0     0.1          -5.0
      0.2          -5.0
1     0.2           0.0
2     0.1           0.0
      0.2           0.0
3     0.1           0.0
      0.2           0.0
Does anyone have a good idea on how to solve this, e.g. using a lambda function in the transform?
CodePudding user response:
derivative = data.groupby("time").transform(lambda x: np.gradient(x) if len(x) > 1 else 0)
does exactly what I wanted. Thanks @Chrysophylaxs
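For anyone who wants the layout from the question (time and position as the index, with only the speed derivative as values), here is a minimal sketch of the same idea. It assumes the derivative is only wanted for the speed column; safe_gradient is a hypothetical helper name, not part of pandas or NumPy:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    {
        "time": [0, 0, 1, 2, 2, 3, 3],
        "position": [0.1, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2],
        "speed": [150.0, 145.0, 149.0, 150.0, 150.0, 150.0, 150.0],
    }
)


def safe_gradient(values):
    """Hypothetical helper: gradient of a group, or zeros if it has fewer than 2 rows."""
    if len(values) < 2:
        # np.gradient would raise here, so return one zero per row instead.
        return np.zeros(len(values))
    return np.gradient(values)


# Move time and position into the index, then transform speed per time group.
indexed = data.set_index(["time", "position"])
speed_derivative = indexed.groupby(level="time")["speed"].transform(safe_gradient)

print(speed_derivative)
```

For the example data this gives -5.0 for both rows at time 0 and 0.0 everywhere else. Returning an array of zeros instead of the scalar 0 keeps the result the same length as the group, so it does not depend on transform broadcasting a scalar.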
CodePudding user response:
Possible option:
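def gradient_group(group):
    # np.gradient needs at least 2 values; fall back to 0 for smaller groups
    if group.shape[0] < 2:
        return 0
    return np.gradient(group)

# group by the "time" column (grouping by data.index would make every group a single row)
data['derivative'] = data.groupby('time')['speed'].transform(gradient_group)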