I have a dataset like the one below; its timestamp column has sub-second (millisecond-level) precision.
pid,timestamp,value
31,2019-03-29 07:14:56.999999756,0.0
31,2019-03-29 07:14:57.250000,0.614595
31,2019-03-29 07:14:57.500000,0.678615
31,2019-03-29 07:14:57.750000,0.687578
31,2019-03-29 07:14:58.000000244,0.559804
31,2019-03-29 07:14:58.250000,0.522672
31,2019-03-29 07:14:58.499999512,0.51627
31,2019-03-29 07:14:58.750000,0.51627
31,2019-03-29 07:14:59.000000244,0.517551
31,2019-03-29 07:14:59.250000,0.51627
31,2019-03-29 07:14:59.500000244,0.509868
31,2019-03-29 07:14:59.750000488,0.513709
31,2019-03-29 07:15:00,0.513709
31,2019-03-29 07:15:00.249999512,0.518831
31,2019-03-29 07:15:00.500000,0.531635
How can I compute the average of the values in every 5-second window? I've tried resample, but it didn't work. Here is my code:
col_list = ["timestamp", "pid","value"]
df = read_csv("data.csv", usecols=col_list)
df['timestamp'] = to_datetime(df['timestamp'], unit='ms')
timestamp_col=df['timestamp'].tolist()
pid_col=df['pid'].tolist()
value_col=df['value'].tolist()
df['timestamp'].resample('5S').mean()
timestamp_col=df['timestamp'].tolist()
Thanks for your help!
CodePudding user response:
Set timestamp as the index before resampling: resample needs a DatetimeIndex, so calling it on a column that still has the default RangeIndex raises a TypeError:
>>> df.set_index('timestamp').groupby('pid') \
.resample('5S')['value'].mean().reset_index()
pid timestamp value
0 31 2019-03-29 07:14:55 0.512767
1 31 2019-03-29 07:15:00 0.521392
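Note that resample aligns its 5-second bins to the clock and labels each bin by its left edge, which is why the first group is labeled 07:14:55 even though the data starts at 07:14:56.99. If you would rather anchor the bins to the first timestamp in the data, resample accepts an origin keyword (assuming pandas >= 1.1, where it was added); with this small sample every row then lands in a single bin starting at the first timestamp:
>>> df.set_index('timestamp').groupby('pid') \
       .resample('5S', origin='start')['value'].mean().reset_index()
   pid                     timestamp     value
0   31 2019-03-29 07:14:56.999999756  0.514492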
Update
Maybe you prefer this version:
>>> df.groupby(['pid', pd.Grouper(freq='5S', key='timestamp')], as_index=False) \
.agg({'pid': 'first', 'timestamp': 'first', 'value': 'mean'})
pid timestamp value
0 31 2019-03-29 07:14:56.999999756 0.512767
1 31 2019-03-29 07:15:00.000000000 0.521392
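The only difference between the two versions is the label: plain resample (and a plain Grouper) labels each group by the left edge of its clock-aligned 5-second bin, while the agg(..., 'first', ...) variant reports the first timestamp actually seen inside each bin. If the bin-edge labels are fine, a minimal sketch of the simpler Grouper route, assuming the same df, reproduces the first result:
>>> df.groupby(['pid', pd.Grouper(freq='5S', key='timestamp')])['value'] \
       .mean().reset_index()
   pid           timestamp     value
0   31 2019-03-29 07:14:55  0.512767
1   31 2019-03-29 07:15:00  0.521392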