I have a dataframe with three columns: ID, Date, Value reporting temperature records (Value) day by day for a month (in datetime-format) from different weather stations (ID). What I need is to 'group by' every weather station and every single day and calculate the maximum temperature for each day for each weather station in a new column.
I always only get the maximum for each weather station OR each day - but needed is the max for each weather station per day.
The table looks like that:
Date | ID | Value |
---|---|---|
2022-05-12 22:09:35 00:00 | 1 | 18.3 |
2022-05-12 22:09:42 00:00 | 2 | 18.0 |
CodePudding user response:
here is one way to do it
df['Date']=pd.to_datetime(df['Date'])
df.groupby(['ID', df['Date'].dt.strftime('%Y%m%d')])['Value'].max()
ID Date
1 20220512 18.3
2 20220512 18.0
CodePudding user response:
IUUC, you can do
df['Date'] = pd.to_datetime(df['Date'])
out = df.groupby(['ID', df['Date'].dt.date])['Value'].max()
print(out)
ID Date
1 2022-05-12 18.3
2 2022-05-12 18.0
Name: Value, dtype: float64