Pandas - Group by Cumulative Calculation-CodePudding

I'm trying to do a calculation with cumulative operations using pandas in jupyter notebook.

But I'm stuck.

I have the following DataFrame:

Date	Type	deposit	withdrawal
2020/01/19	A	1000	0
2020/01/23	A	500	0
2020/02/05	A	0	200
2020/02/22	A	1000	200
2020/01/19	B	859.8	0
2020/02/04	B	514.33	50
2020/02/25	B	1463.14	0

I need to apply this formula, grouping by type:

Movement = IF(ISNULL([previous_date]); [Current_Deposit] - [withdrawal];([previous_movement] * (1  ([current_date]-[previous_date]) * 0,0001))   [Current_Deposit] - [withdrawal])

Example:

Date	Type	deposit	withdrawal	Movement	Formula
2020/01/19	A	1000	0	1000	IF(ISNULL([NULL]); [1000] - [0];([0] * (1 ([2020/01/19]-[NULL]) * 0,0001)) [1000] - [0])
2020/01/23	A	500	0	1500.40	IF(ISNULL([2020/01/19]); [500] - [0];([1000] * (1 ([2020/01/23]-[2020/01/19]) * 0,0001)) [500] - [0])
2020/02/05	A	0	200	1302.35	IF(ISNULL([2020/01/23]); [0] - [200];([1500.40] * (1 ([2020/02/05]-[2020/01/23]) * 0,0001)) [0] - [200])
2020/02/22	A	1000	200	2104,56	IF(ISNULL([2020/02/05]); [1000] - [200];([1302.35] * (1 ([2020/02/22]-[2020/02/05]) * 0,0001)) [1000] - [200])
2020/01/19	B	859.8	0	859.80	IF(ISNULL([NULL]); [859.8] - [0];([0] * (1 ([2020/01/19]-[NULL]) * 0,0001)) [859.8] - [0])
2020/02/04	B	514.33	50	1325.51	IF(ISNULL([2020/01/19]); [514.33] - [50];([859.80] * (1 ([2020/02/04]-[2020/01/19]) * 0,0001)) [514.33] - [50])
2020/02/25	B	1463.14	0	2791.43	IF(ISNULL([2020/02/04]); [1463.14] - [0];([1325.51] * (1 ([2020/02/25]-[2020/02/04]) * 0,0001)) [1463.14] - [0])

I'm not able to accumulate the values to put in the formula.

I need to group the calculations by type, and accumulate the values to use in the current line. I'm new to python and I'm not able to implement it.

CodePudding user response：

I assume that your dataframe is sorted by Type and Date

# silencing chained assignment warning
pd.options.mode.chained_assignment = None

df['Movement'] = 0

df_new = df
for row in df.iterrows():
    x = row[1]
    y = df_new.shift().loc[x.name]
    if x['Type'] == y['Type']:
        # computing the given days compound interest should calculate using power
        compound_interest = pow(1.0001, pd.Timedelta(pd.to_datetime(x['Date']) - pd.to_datetime(y['Date'])).days)
        # if you wish to use your formula uncomment the following line
        # compound_interest = 1   0.0001 * pd.Timedelta(pd.to_datetime(x['Date']) - pd.to_datetime(y['Date'])).days
        df_new['Movement'].loc[x.name] = compound_interest * y['Movement']   x['deposit'] - x['withdrawal']
    else:
        df_new['Movement'].loc[x.name] = x['deposit']

print(df_new)

OUTPUT

      Date Type  deposit  withdrawal    Movement
2020/01/19    A  1000.00           0 1000.000000
2020/01/23    A   500.00           0 1500.400060
2020/02/05    A     0.00         200 1302.351751
2020/02/22    A  1000.00         200 2104.567521
2020/01/19    B   859.80           0  859.800000
2020/02/04    B   514.33          50 1325.506712
2020/02/25    B  1463.14           0 2791.433062