I have a large DataFrame. Here is an extract of it:
import pandas as pd

lst = [['31122020', 'A', 12], ['31012021', 'A', 14], ['28022021', 'A', 15], ['31032021', 'A', 17]]
df2 = pd.DataFrame(lst, columns=['Date', 'FN', 'AuM'])
I would like to calculate the year-to-date (YTD) percentage change of the column AuM, measured against the last value of the previous year. The new column should look like this:
lst=[['31122020','A',12,'NaN'],['31012021','A',14,0.167],['28022021','A',15,0.25],['31032021','A',17,0.417]]
df2=pd.DataFrame(lst, columns=['Date','FN','AuM','AuM_YTD_%Change'])
Is there a pandas function that does this?
CodePudding user response:
You can create a mask for the dates within one year of the first date, then use diff and cumsum for the cumulative changes, and div for the change rates:
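# Parse the DDMMYYYY strings into real datetimes
df2['Date'] = pd.to_datetime(df2['Date'], format='%d%m%Y')
# Keep only rows less than 365 days after the first row, which serves as the base
msk = df2['Date'] < df2.loc[0, 'Date'] + pd.to_timedelta(365, unit='D')
# Cumulative change since the base row, divided by the base value
df2['AuM_YTD_%Change'] = df2.loc[msk, 'AuM'].diff().cumsum().div(df2.loc[0, 'AuM'])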
Output:
        Date FN  AuM  AuM_YTD_%Change
0 2020-12-31  A   12              NaN
1 2021-01-31  A   14         0.166667
2 2021-02-28  A   15         0.250000
3 2021-03-31  A   17         0.416667
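The code above takes row 0 as the base, which works for this extract. Since the full DataFrame is large, it may contain several FN values and more than one year. Here is a minimal sketch of one way to handle that. It assumes the YTD base for each row is the last AuM recorded in the previous calendar year for the same FN. The Year and Base helper columns are names made up for this illustration.

```python
import pandas as pd

lst = [['31122020', 'A', 12], ['31012021', 'A', 14],
       ['28022021', 'A', 15], ['31032021', 'A', 17]]
df2 = pd.DataFrame(lst, columns=['Date', 'FN', 'AuM'])
df2['Date'] = pd.to_datetime(df2['Date'], format='%d%m%Y')
df2 = df2.sort_values(['FN', 'Date']).reset_index(drop=True)

# Assumption: the base for year Y is the last AuM seen in year Y-1 for the same FN.
with_year = df2.assign(Year=df2['Date'].dt.year)
base = (with_year.groupby(['FN', 'Year'])['AuM'].last()
                 .rename('Base')
                 .reset_index())
base['Year'] += 1  # shift each closing value so it lines up with the following year

# A left merge keeps the row order of with_year; rows without a prior year get NaN.
merged = with_year.merge(base, on=['FN', 'Year'], how='left')
df2['AuM_YTD_%Change'] = (merged['AuM'] / merged['Base'] - 1).to_numpy()
print(df2)
```

On the four-row extract this should give the same values as the mask-based version, with NaN for the 2020 row because no 2019 closing value exists.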