Dask DataFrame aggregate data based on timestamp

Time:09-22

I have a Dask DataFrame with a date-time column and several numeric columns. Successive rows are separated by a fixed interval of t minutes. I want to aggregate the data hourly, so that the average of each numeric column is computed for every hour. How can this be done? Can groupby on a date-time column take an aggregation interval?

CodePudding user response:

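You probably want the resample method. It groups the rows of a DataFrame with a datetime index into fixed time bins (one hour here), and you then apply an aggregation such as mean to each bin.

In your case: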

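import dask

# Synthetic data: datetime index, columns name (string), id, x and y
df = dask.datasets.timeseries()

# Keep only the numeric columns; averaging the string column "name"
# can raise a TypeError on recent pandas versions
numeric = df[["id", "x", "y"]]

# Compute the average for each hour
# ("h" is the hourly alias; "H" is deprecated since pandas 2.2)
numeric.resample("h").mean().compute()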