I have the following dataframe:
import pandas as pd

date = ['2015-02-03 21:00:00', '2015-02-03 22:30:00', '2016-02-03 21:00:00', '2016-02-03 22:00:00']
value_column = [33.24, 500, 34.39, 34.49]
df = pd.DataFrame({'V1': value_column}, index=pd.to_datetime(date))
print(df.head())
                        V1
index
2015-02-03 21:00:00  33.24
2015-02-03 22:30:00    500
2016-02-03 21:00:00  34.39
2016-02-03 22:00:00  34.49
I am trying to add a new column to this dataframe that holds, in each row, the maximum value of column V1 for that row's year.
I know how to extract the maximum value of column V1 for each year:
df['V1'].groupby(df.index.year).max()
But I do not know how to assign these values back to my original dataframe efficiently. Any idea how to do that? Expected result:
                        V1  max V1
index
2015-02-03 21:00:00  33.24     500
2015-02-03 22:30:00    500     500
2016-02-03 21:00:00  34.39   34.49
2016-02-03 22:00:00  34.49   34.49
Many thanks for your help!
CodePudding user response:
You can use .transform("max"):
df["max V1"] = df["V1"].groupby(df.index.year).transform("max")
print(df)
Prints:
                         V1  max V1
2015-02-03 21:00:00   33.24  500.00
2015-02-03 22:30:00  500.00  500.00
2016-02-03 21:00:00   34.39   34.49
2016-02-03 22:00:00   34.49   34.49
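For comparison, here is a self-contained sketch of why this works. transform returns a result with the same index as the input, so it can be assigned directly as a column, whereas .max() returns one row per year and has to be mapped back. The column name 'max V1 (map)' and the helper variables yearly_max and years are only illustrative and not part of the question.

```python
import pandas as pd

date = ['2015-02-03 21:00:00', '2015-02-03 22:30:00',
        '2016-02-03 21:00:00', '2016-02-03 22:00:00']
df = pd.DataFrame({'V1': [33.24, 500, 34.39, 34.49]},
                  index=pd.to_datetime(date))

# Option 1: transform broadcasts each group's max back onto the original rows.
df['max V1'] = df['V1'].groupby(df.index.year).transform('max')

# Option 2 (illustrative alternative): aggregate per year, then map each
# row's year to that year's max.
yearly_max = df['V1'].groupby(df.index.year).max()   # one value per year
years = pd.Series(df.index.year, index=df.index)     # year of each row
df['max V1 (map)'] = years.map(yearly_max)

print(df)
```

Both columns should hold the same values. The transform version is usually the simpler choice because it skips the intermediate lookup.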