Home > Back-end >  How to make a custom sum in pandas?
How to make a custom sum in pandas?

Time:03-21

I'm trying to create a running sum on a Dataframe that follows some conditions, sum all previous values with same 'first index current day' value and condiction is true, the pseudocode would be something like

df_ = df[first index current day:current index] #Delimiter dataframe range from begin of day until current row
df_ = df_[df_['toSum'] == True] #drop False
result = sum(df_['sales']) #Sum all sales until now with 'toSum' == True

This is very easy using a loop or even a apply but take too much time, my dataframe have more than 1000k rows, the best I could do until now is use threads but if it's possible to vectorize that algorithm it would be a lot fastter

The result is something like this:

Actual Id First ID Actual Day Sales toSum AcumulativeDay
0 0 1 True 1
1 0 1 True 2
2 0 1 False 2
3 0 1 False 2
4 0 1 True 3
5 5 1 True 1
6 5 1 True 2
7 5 1 False 2
8 5 1 False 2
9 5 1 True 3

CodePudding user response:

After having the table filtered as you did use:

df['cumsum'] = df['sales'].cumsum()

You may want to have the dataframe sorted by date.

CodePudding user response:

Use groupby over First ID Actual Day to get all previous values with same first day ID. Then, multiply the boolean column toSum with Sales so that cumsum can interpret the False values as zeros. Remove the indices with droplevel and make the attribution to the variable.

import pandas as pd

d= {'Actual Id': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9},
 'First ID Actual Day': {0: 0,1: 0,2: 0,3: 0,4: 0,5: 5,6: 5,7: 5,8: 5,9: 5},
 'Sales': {0: 1.0,1: 11.33,2: 1.0,3: 31.2,4: 1.0,5: 241.14,6: 1.0,7: 1.0,8: 1.0,9: 54.22},
 'toSum': {0: True,1: True,2: False,3: False,4: True,5: True,6: True,7: False,8: False,9: True}
}
df = pd.DataFrame(d)

df['AcumulativeDay'] = df.groupby(['First ID Actual Day'])\
                        .apply(lambda x: (x['Sales']*x['toSum'])\
                        .cumsum()).droplevel(0)
   Actual Id  First ID Actual Day   Sales  toSum  AcumulativeDay
0          0                    0    1.00   True            1.00
1          1                    0   11.33   True           12.33
2          2                    0    1.00  False           12.33
3          3                    0   31.20  False           12.33
4          4                    0    1.00   True           13.33
5          5                    5  241.14   True          241.14
6          6                    5    1.00   True          242.14
7          7                    5    1.00  False          242.14
8          8                    5    1.00  False          242.14
9          9                    5   54.22   True          296.36
  • Related