Home > Blockchain >  Cumulative Value summed up with first entry of a column with groupby
Cumulative Value summed up with first entry of a column with groupby

Time:04-05

I have got the following dataframe:

lst=[['01012021','A',100,'NaN'],['01012021','B',120,'NaN'],['01022021','A',140,5],['01022021','B',160,12],['01032021','A',180,20],['01032021','B',200,25]]
df1=pd.DataFrame(lst,columns=['Date','FN','AuM','NNA'])

I would like to generate a new column (AuM_2) which takes the starting value in column AuM and accumulates the entries in column NNA by the column Product. The end result should look like the below mentioned dataframe:

lst=[['01012021','A',100,'NaN','NaN'],['01012021','B',120,'NaN','NaN'],['01022021','A',140,5,105],['01022021','B',160,12,132],['01032021','A',180,20,125],['01032021','B',200,25,157]]
df2=pd.DataFrame(lst,columns=['Date','FN','AuM','NNA','AuM_2'])

Do you have any idea how I could generate column AuM_2 with groupby? Thank you for your assistance.

CodePudding user response:

Sum values from GroupBy.cumsum and first values per groups by GroupBy.transform:

df1['NNA'] = pd.to_numeric(df1['NNA'], errors='coerce')
df1['AuM'] = pd.to_numeric(df1['AuM'], errors='coerce')

g = df1.groupby('FN')
df1['AuM_2'] = g['NNA'].cumsum()   g['AuM'].transform('first')
print (df1)
       Date FN  AuM   NNA  AuM_2
0  01012021  A  100   NaN    NaN
1  01012021  B  120   NaN    NaN
2  01022021  A  140   5.0  105.0
3  01022021  B  160  12.0  132.0
4  01032021  A  180  20.0  125.0
5  01032021  B  200  25.0  157.0
  • Related