Find max value and Drop row based on time python

Time:06-15

I am trying to find the maximum value of precip for each station and each hour of the valid column, and keep only the row that holds that maximum. I have no working code to show: I have been at this for a long time and have not found anything close to a working solution.

Here is the dataframe I am working with:

station valid precip
1V4 2022-02-23 07:54:00 0.2
1V4 2022-02-23 07:55:00 0.2
1V4 2022-02-23 07:58:00 0.4
1V4 2022-02-23 08:07:00 0.1
1V4 2022-02-23 08:51:00 0.6
BTV 2022-02-23 07:52:00 0.1
BTV 2022-02-23 07:54:00 0.2
BTV 2022-02-23 07:59:00 0.3
BTV 2022-02-23 08:02:00 0.0
BTV 2022-02-23 08:16:00 0.0
BTV 2022-02-23 08:29:00 0.3

This is what I want the result to look like:

station valid precip
1V4 2022-02-23 07:58:00 0.4
1V4 2022-02-23 08:51:00 0.6
BTV 2022-02-23 07:59:00 0.3
BTV 2022-02-23 08:29:00 0.3

CodePudding user response:

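You need to group by station and hour, use idxmax to get the index label of the row with the largest precip in each group, then select those rows with loc:

import pandas as pd

df['valid'] = pd.to_datetime(df['valid'])

# idxmax returns the index label of the max-precip row per (station, hour)
df.loc[df.groupby(['station', df['valid'].dt.hour])['precip'].idxmax()]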

output:

   station               valid  precip
2      1V4 2022-02-23 07:58:00     0.4
4      1V4 2022-02-23 08:51:00     0.6
7      BTV 2022-02-23 07:59:00     0.3
10     BTV 2022-02-23 08:29:00     0.3
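
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's dataframe and applies the same idxmax approach. One change is my own assumption rather than part of the answer above: it groups on the timestamp floored to the hour (dt.floor) instead of dt.hour, so that 07:xx on one day is not merged with 07:xx on another day if the real data spans several days. The names hour_start and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "station": ["1V4"] * 5 + ["BTV"] * 6,
    "valid": [
        "2022-02-23 07:54:00", "2022-02-23 07:55:00", "2022-02-23 07:58:00",
        "2022-02-23 08:07:00", "2022-02-23 08:51:00",
        "2022-02-23 07:52:00", "2022-02-23 07:54:00", "2022-02-23 07:59:00",
        "2022-02-23 08:02:00", "2022-02-23 08:16:00", "2022-02-23 08:29:00",
    ],
    "precip": [0.2, 0.2, 0.4, 0.1, 0.6, 0.1, 0.2, 0.3, 0.0, 0.0, 0.3],
})
df["valid"] = pd.to_datetime(df["valid"])

# Assumption: bucket by the start of each hour (date + hour), not hour-of-day,
# so that the same hour on different days forms separate groups.
hour_start = df["valid"].dt.floor("h").rename("hour_start")

# Index label of the max-precip row in each (station, hour) group,
# then select those rows from the original frame.
result = df.loc[df.groupby(["station", hour_start])["precip"].idxmax()]
print(result)
```

On the single-day sample this selects the same four rows as the output above (index 2, 4, 7, 10).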

CodePudding user response:

This groups by station and hour, then keeps the rows whose precip equals the maximum of their group:

df['Hour'] = pd.to_datetime(df['valid']).dt.hour
df['precip_max'] = df.groupby(['station', 'Hour'])['precip'].transform('max')
df.loc[df['precip'] == df['precip_max']]
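
One behavioural difference is worth knowing when choosing between a transform('max') comparison and idxmax: the comparison keeps every row that ties for the group maximum, while idxmax keeps only the first such row. The sketch below illustrates this on made-up data (three hypothetical BTV readings with a deliberate tie, not the values from the question); the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical data: two readings in the same hour tie for the maximum
df = pd.DataFrame({
    "station": ["BTV", "BTV", "BTV"],
    "valid": pd.to_datetime([
        "2022-02-23 08:02:00",
        "2022-02-23 08:16:00",
        "2022-02-23 08:29:00",
    ]),
    "precip": [0.3, 0.0, 0.3],
})

hour = df["valid"].dt.hour.rename("hour")
groups = df.groupby(["station", hour])["precip"]

# transform("max") broadcasts the group maximum back onto every row,
# so the equality test keeps all rows that reach it (both 0.3 rows here).
all_ties = df[df["precip"] == groups.transform("max")]

# idxmax returns a single index label per group: the first maximum.
first_only = df.loc[groups.idxmax()]

print(all_ties)
print(first_only)
```

If exactly one row per station and hour is required, idxmax (or dropping duplicates after the comparison) is the safer choice.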