Home > OS >  How to calculate running total per customer for previous 365 days in pandas
How to calculate running total per customer for previous 365 days in pandas

Time:07-30

I am trying to calculate a running total per customer for the previous 365 days using pandas but my code isn't working. My intended output would be something like this:

date customer daily_total_per_customer rolling_total
2016-07-29 1 100 100
2016-08-01 1 50 150
2017-01-12 1 80 230
2017-10-23 1 180 260
2018-03-03 1 0 180
2018-03-06 1 40 220
2019-03-16 1 500 500
2017-04-07 2 50 50
2017-04-09 2 230 280
2018-02-11 2 80 360
2018-05-12 2 0 80
2019-05-10 2 0 0

I tried the following:

df_3 = df_3.set_index(['customer', 'date']).sort_values(by='date')

rolling_sum = df_3.rolling('365d', on='date')["daily_total_per_customer"].sum()
df_3["rolling_total"] = rolling_sum

And I get the following error

ValueError: invalid on specified as date, must be a column (of DataFrame), an Index or None To recreate the code:

dates = ['2016-07-29',
'2016-08-01',
'2017-01-12',
'2017-10-23',
'2018-03-03',
'2018-03-06',
'2019-03-16',
'2017-04-07',
'2017-04-09',
'2018-02-11',
'2018-05-12',
'2019-05-10',
]


customer = [1,1,1,1,1,1,1,2,2,2,2,2]

daily_total = [100,50,80,180,0,40,500,50,230,80,0,0]


df = pd.DataFrame({'date': dates,
            'customer': customer,
            'daily_total_per_customer':daily_total,})

Perhaps someone can point me in the right direction. Thanks!

CodePudding user response:

Annotated code

# Parse the strings to datetime
df['date'] = pd.to_datetime(df['date'])

# Sort the dates in ASC order if not already sorted
df = df.sort_values(['customer', 'date']) 

# Group the dataframe by customer then for each group
# calculate rolling sum on 'daily_total_per_customer'
s = df.groupby('customer').rolling('365d', on='date')['daily_total_per_customer'].sum()

# Merge the result with original df
df.merge(s.reset_index(name='rolling_total'))

         date  customer  daily_total_per_customer  rolling_total
0  2016-07-29         1                       100            100
1  2016-08-01         1                        50            150
2  2017-01-12         1                        80            230
3  2017-10-23         1                       180            260
4  2018-03-03         1                         0            180
5  2018-03-06         1                        40            220
6  2019-03-16         1                       500            500
7  2017-04-07         2                        50             50
8  2017-04-09         2                       230            280
9  2018-02-11         2                        80            360
10 2018-05-12         2                         0             80
11 2019-05-10         2                         0              0
  • Related