Resample hourly to daily and group by min, max and mean values

Time:11-30

I have an hourly DataFrame, df, and I need to create a new DataFrame with the min, mean and max values for each day. Here's what I tried:

import numpy as np
import pandas as pd
# 72 hourly samples covering 2022-01-01 through 2022-01-03
df = pd.DataFrame(np.random.rand(72, 1),
                    columns=["Random"],
                    index=pd.date_range(start="20220101000000", end="20220103230000", freq='h'))

df_min = df.resample('D').min()
df_mean = df.resample('D').mean()
df_max = df.resample('D').max()

But I'm not sure how to combine those three DataFrames (df_min, df_mean and df_max) into a single new DataFrame, df_new. The new DataFrame should have only one column, something like this:

2022-01-01T00:00:00.00 0.002  <- min
2022-01-01T00:00:00.00 0.023  <- mean
2022-01-01T00:00:00.00 0.965  <- max
2022-01-02T00:00:00.00 0.013  <- min
2022-01-02T00:00:00.00 0.053  <- mean
2022-01-02T00:00:00.00 0.825  <- max
2022-01-03T00:00:00.00 0.011  <- min
2022-01-03T00:00:00.00 0.172  <- mean
2022-01-03T00:00:00.00 0.992  <- max

CodePudding user response:

Use Resampler.agg with a list of functions, then reshape with DataFrame.stack and remove the second level of the MultiIndex with Series.droplevel:

s = df.resample('D')['Random'].agg(['min','mean','max']).stack().droplevel(1)
print(s)
2022-01-01    0.162976
2022-01-01    0.574074
2022-01-01    0.980742
2022-01-02    0.012299
2022-01-02    0.467338
2022-01-02    0.962570
2022-01-03    0.000722
2022-01-03    0.426793
2022-01-03    0.947014
dtype: float64
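
If it helps to see what each call contributes, here is a rough sketch of the same chain split into steps. The one-liner above returns a Series, so the sketch also wraps it in a one-column DataFrame, which is what the question asked for. The names daily, stacked and labeled, the fixed seed, and the "date"/"stat" column labels are only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question: 72 hourly samples over three days.
# The seed is an assumption, used only to make the example repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((72, 1)),
    columns=["Random"],
    index=pd.date_range(start="2022-01-01", periods=72, freq="h"),
)

# Step 1: one row per day, one column per statistic.
daily = df.resample("D")["Random"].agg(["min", "mean", "max"])

# Step 2: move the statistic columns into the index, giving (day, statistic) rows.
stacked = daily.stack()

# Step 3: drop the statistic level and wrap the Series in a one-column DataFrame.
df_new = stacked.droplevel(1).to_frame("Random")

# Optional variant: keep the statistic name as a column instead of discarding it.
labeled = stacked.rename_axis(["date", "stat"]).reset_index(name="Random")

print(df_new)
print(labeled)
```

The labeled variant may be easier to work with later, because after droplevel the three rows for a day can only be told apart by their position.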