Pandas group by with conditional transform()

Time:01-03

I am tackling an issue in pandas:

I would like to group a DataFrame by an index column and then apply transform(np.gradient), i.e. compute the numerical derivative over all values in each group. This fails when a group is too small (fewer than 2 elements), so in that case I would like to return 0 instead.

The following code raises an error:

import pandas as pd
import numpy as np


data = pd.DataFrame(
    {
        "time": [0, 0, 1, 2, 2, 3, 3],
        "position": [0.1, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2],
        "speed": [150.0, 145.0, 149.0, 150.0, 150.0, 150.0, 150.0],
    }
)

derivative = data.groupby("time").transform(np.gradient)

Gives me a ValueError:

ValueError: Shape of array too small to calculate a numerical gradient, at least (edge_order + 1) elements are required.
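
The error can be reproduced without pandas, which suggests the single-row group (time == 1) is the culprit. A minimal sketch, assuming the default edge_order=1; the variable names are only illustrative:

```python
import numpy as np

# With the default edge_order=1, np.gradient needs at least 2 samples.
# Two values, like the speeds in the time == 0 group, work fine.
two_points = np.gradient(np.array([150.0, 145.0]))

# One value, like the speed in the time == 1 group, raises ValueError.
try:
    np.gradient(np.array([149.0]))
    raised = False
except ValueError:
    raised = True
```

So any fix has to treat groups with a single row separately before np.gradient is called.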

The desired output for the example DataFrame above would be

               speed
time position
0    0.1        -5.0
     0.2        -5.0
1    0.2         0.0
2    0.1         0.0
     0.2         0.0
3    0.1         0.0
     0.2         0.0

Does anyone have a good idea on how to solve this, e.g. using a lambda function in the transform?

CodePudding user response:

derivative = data.groupby("time").transform(lambda x: np.gradient(x) if len(x) > 1 else 0)

does exactly what I wanted. Thanks @Chrysophylaxs
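
For anyone landing here later, a self-contained sketch of this approach on the example data from the question. Selecting the "speed" column before transform is an assumption on my part, so that only the speed derivative comes back; without it, the lambda is applied to every non-grouping column.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    {
        "time": [0, 0, 1, 2, 2, 3, 3],
        "position": [0.1, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2],
        "speed": [150.0, 145.0, 149.0, 150.0, 150.0, 150.0, 150.0],
    }
)

# np.gradient needs at least 2 values, so fall back to 0 for smaller groups.
# transform broadcasts the scalar 0 to the length of the group.
derivative = data.groupby("time")["speed"].transform(
    lambda x: np.gradient(x) if len(x) > 1 else 0
)

result = derivative.tolist()
```

The result has one value per original row, in the original row order, so it can be assigned straight back as a new column.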

CodePudding user response:

Possible option:

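def gradient_group(group):
    # np.gradient needs at least 2 values; return 0 for smaller groups
    if group.shape[0] < 2:
        return 0
    return np.gradient(group)

# group by "time" and keep one result per original row
data['derivative'] = data.groupby("time")["speed"].transform(gradient_group)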
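
If the result should look like the desired output in the question (indexed by time and position), one option is to move those columns into the index first and group on the index level. This is only a sketch: the helper name gradient_or_zero and the choice of "time" and "position" as index levels are assumptions based on the example data.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    {
        "time": [0, 0, 1, 2, 2, 3, 3],
        "position": [0.1, 0.2, 0.2, 0.1, 0.2, 0.1, 0.2],
        "speed": [150.0, 145.0, 149.0, 150.0, 150.0, 150.0, 150.0],
    }
)

def gradient_or_zero(group):
    # Hypothetical helper: guard against groups too small for np.gradient.
    if len(group) < 2:
        return 0
    return np.gradient(group)

# Put time and position into the index, then group on the "time" level.
indexed = data.set_index(["time", "position"])
derivative = indexed.groupby(level="time")["speed"].transform(gradient_or_zero)

values = derivative.tolist()
index_names = list(derivative.index.names)
```

Because transform keeps the original index, the MultiIndex of time and position carries over to the result.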