I have an hourly dataframe, df, and I need to create a new dataframe with the min, mean and max values from each day. Here's what I tried to do:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(72, 1),
                  columns=["Random"],
                  index=pd.date_range(start="20220101000000", end="20220103230000", freq='H'))
df_min = df.resample('D').min()
df_mean = df.resample('D').mean()
df_max = df.resample('D').max()
But I'm not really sure how I could combine those three dataframes (df_min, df_mean and df_max) into a new single dataframe, df_new. This new dataframe should have only one column, something like this:
2022-01-01T00:00:00.00 0.002 <- min
2022-01-01T00:00:00.00 0.023 <- mean
2022-01-01T00:00:00.00 0.965 <- max
2022-01-02T00:00:00.00 0.013 <- min
2022-01-02T00:00:00.00 0.053 <- mean
2022-01-02T00:00:00.00 0.825 <- max
2022-01-03T00:00:00.00 0.011 <- min
2022-01-03T00:00:00.00 0.172 <- mean
2022-01-03T00:00:00.00 0.992 <- max
CodePudding user response:
Use Resampler.agg with a list of functions, then reshape with DataFrame.stack and remove the second level of the resulting MultiIndex with Series.droplevel:
s = df.resample('D')['Random'].agg(['min','mean','max']).stack().droplevel(1)
print(s)
2022-01-01 0.162976
2022-01-01 0.574074
2022-01-01 0.980742
2022-01-02 0.012299
2022-01-02 0.467338
2022-01-02 0.962570
2022-01-03 0.000722
2022-01-03 0.426793
2022-01-03 0.947014
dtype: float64
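Note that s above is a Series, while the question asks for a one-column dataframe, df_new. A minimal sketch of that last step follows, assuming the column should keep the name "Random". The seeded generator, the "60min" frequency string and the names stats, stacked, labelled and df_concat are illustrative choices, not part of the original post.

```python
import numpy as np
import pandas as pd

# Reproducible stand-in for the hourly data in the question (the seed is an assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((72, 1)),
    columns=["Random"],
    index=pd.date_range(start="2022-01-01 00:00", end="2022-01-03 23:00", freq="60min"),
)

# One row per day, one column per statistic.
stats = df.resample("D")["Random"].agg(["min", "mean", "max"])

# stack() moves the statistic names into a second index level: (day, statistic).
stacked = stats.stack()

# Drop the statistic level and convert the Series into a one-column DataFrame.
df_new = stacked.droplevel(1).to_frame("Random")

# Optional variant: keep the statistic name as a regular column instead of dropping it.
labelled = stacked.rename_axis(["date", "stat"]).reset_index(name="Random")

# Alternative starting from the three frames in the question: concatenate them,
# then use a stable sort so each day keeps the min, mean, max order.
df_min = df.resample("D").min()
df_mean = df.resample("D").mean()
df_max = df.resample("D").max()
df_concat = pd.concat([df_min, df_mean, df_max]).sort_index(kind="mergesort")

print(df_new)
```

The labelled variant may be easier to work with later, because after droplevel nothing in df_new itself says which row is the min, the mean or the max; only the row order does.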