groupby max value of each year in initial pandas dataframe

Time:10-07

I have the following dataframe:

import pandas as pd

date = ['2015-02-03 21:00:00', '2015-02-03 22:30:00', '2016-02-03 21:00:00', '2016-02-03 22:00:00']
value_column = [33.24, 500, 34.39, 34.49]

df = pd.DataFrame({'V1': value_column}, index=pd.to_datetime(date))
print(df.head())
    
                           V1 
     index                                     
     2015-02-03 21:00:00  33.24   
     2015-02-03 22:30:00  500   
     2016-02-03 21:00:00  34.39   
     2016-02-03 22:00:00  34.49   

I am trying to add a column to this dataframe that holds, in each row, the maximum of column V1 for that row's year.

I know how to extract the maximum value of column V1 for each year:

df['V1'].groupby(df.index.year).max()

But I do not know how to assign these values back to my original dataframe efficiently. Any ideas on how to do that? Expected result:

                        V1     max V1
 index                                     
 2015-02-03 21:00:00  33.24    500 
 2015-02-03 22:30:00  500      500
 2016-02-03 21:00:00  34.39    34.49
 2016-02-03 22:00:00  34.49    34.49

Many thanks for your help!

Answer:

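You can use .transform("max"). Unlike .max(), which returns one value per year, it returns the group maximum aligned to the original index, so the result can be assigned directly as a new column: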

df["max V1"] = df["V1"].groupby(df.index.year).transform("max")
print(df)

Prints:

                         V1  max V1
2015-02-03 21:00:00   33.24  500.00
2015-02-03 22:30:00  500.00  500.00
2016-02-03 21:00:00   34.39   34.49
2016-02-03 22:00:00   34.49   34.49
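
To see why this works, it may help to compare the aggregated result with the transformed one. The sketch below rebuilds the question's dataframe and also shows a lookup with map as an alternative; the names yearly_max, via_transform and via_map are only illustrative and do not come from the question or the answer.

```python
import pandas as pd

date = ['2015-02-03 21:00:00', '2015-02-03 22:30:00',
        '2016-02-03 21:00:00', '2016-02-03 22:00:00']
df = pd.DataFrame({'V1': [33.24, 500, 34.39, 34.49]},
                  index=pd.to_datetime(date))

# Aggregation: one value per year, indexed by the year (2015, 2016).
# Its index does not match df's index, so it cannot be assigned as-is.
yearly_max = df['V1'].groupby(df.index.year).max()

# transform: each group's max is broadcast back onto the original rows,
# so the result has the same DatetimeIndex as df.
via_transform = df['V1'].groupby(df.index.year).transform('max')

# Alternative (illustrative): look up each row's year in yearly_max.
via_map = df.index.to_series().dt.year.map(yearly_max)

df['max V1'] = via_transform
print(df)
```

Both approaches should give the same column here. transform is usually the more direct choice because it skips the intermediate per-year lookup.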