I have a Dask DataFrame with a date-time column and other numeric columns. Successive rows differ by a fixed time interval of t minutes. I want to aggregate the data hourly, so that the average of the other columns is computed for every hour. How can this be done? Can groupby on the date-time column include specifying an aggregation interval?
CodePudding user response:
You probably want the resample method. In your case:
import dask
# Synthetic data
df = dask.datasets.timeseries()
# Compute the average for each hour
df.resample('H').mean().compute()
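Note that resample operates on the DataFrame's index, so it needs the date-time values to be the index (the synthetic timeseries dataset above already has a timestamp index). If your timestamps live in an ordinary column instead, here is a minimal sketch, assuming hypothetical column names 'timestamp' and 'value' (adjust to your own data):
import pandas as pd
import dask.dataframe as dd
# Example frame with a date-time column at a fixed t-minute spacing (here 5 min)
pdf = pd.DataFrame({
    "timestamp": pd.date_range("2023-01-01", periods=1000, freq="5min"),
    "value": range(1000),
})
ddf = dd.from_pandas(pdf, npartitions=4)
# Make the date-time column the index, then resample hourly and take the mean
hourly = ddf.set_index("timestamp").resample("H").mean().compute()
Keep in mind that set_index on a Dask DataFrame triggers a sort/shuffle of the data, which is the main cost of this approach on larger datasets.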