I have a pandas DataFrame with these columns:
DAT_MESURE datetime64[ns]
MES_TEMPERATURE object
and values:
I want to compute the mean temperature per hour and get a new DataFrame. For example, the new DataFrame should have DAT_MESURE rounded down to the hour and the mean of the 4 values measured in that hour.
I want to get:
DAT_MESURE MES_TEMPERATURE
2020-08-01 00:00:00 21,xx
2020-08-01 01:00:00 22,xx
How can I do this in Python with pandas?
CodePudding user response:
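The column is stored as strings with a decimal comma, so convert it to float first, then resample by hour:
df['MES_TEMPERATURE'] = df['MES_TEMPERATURE'].str.replace(',', '.', regex=False).astype(float)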
df1 = df.resample('H', on='DAT_MESURE')['MES_TEMPERATURE'].mean()
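Or group by the timestamp floored to the hour (unlike resample, this returns no rows for hours that have no data):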
df2 = df.groupby(df['DAT_MESURE'].dt.floor('H'))['MES_TEMPERATURE'].mean()
If you need to round to the nearest hour instead of flooring:
df3 = df.groupby(df['DAT_MESURE'].dt.round('H'))['MES_TEMPERATURE'].mean()
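Each of these returns a Series indexed by hour. As a minimal end-to-end sketch of the floor-and-average approach, the example below builds a small made-up frame (the sample values are assumptions, not the asker's data), converts the decimal commas, and calls reset_index() to get a regular DataFrame with both columns. It uses the lowercase 'h' alias, since recent pandas versions deprecate the uppercase 'H'.

```python
import pandas as pd

# Hypothetical sample data: four readings per hour, decimal comma as in the question.
df = pd.DataFrame({
    "DAT_MESURE": pd.to_datetime([
        "2020-08-01 00:00:00", "2020-08-01 00:15:00",
        "2020-08-01 00:30:00", "2020-08-01 00:45:00",
        "2020-08-01 01:00:00", "2020-08-01 01:15:00",
        "2020-08-01 01:30:00", "2020-08-01 01:45:00",
    ]),
    "MES_TEMPERATURE": ["21,0", "21,2", "21,4", "21,6",
                        "22,0", "22,2", "22,4", "22,6"],
})

# Replace the decimal comma with a point, then convert the strings to float.
df["MES_TEMPERATURE"] = (
    df["MES_TEMPERATURE"].str.replace(",", ".", regex=False).astype(float)
)

# Floor each timestamp to its hour, average per hour,
# and turn the resulting Series back into a two-column DataFrame.
hourly = (
    df.groupby(df["DAT_MESURE"].dt.floor("h"))["MES_TEMPERATURE"]
      .mean()
      .reset_index()
)
print(hourly)
```

The same reset_index() call works on the resample result if you prefer that variant.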