I have the following example Pandas DataFrame
df
UserID Total Date
1 20 2019-01-01
1 18 2019-01-02
1 22 2019-01-03
1 16 2019-01-04
1 17 2019-01-05
1 26 2019-01-06
1 30 2019-01-07
1 28 2019-01-08
1 28 2019-01-09
1 28 2019-01-10
2 22 2019-01-01
2 11 2019-01-02
2 23 2019-01-03
2 14 2019-01-04
2 19 2019-01-05
2 29 2019-01-06
2 21 2019-01-07
2 22 2019-01-08
2 30 2019-01-09
2 16 2019-01-10
3 27 2019-01-01
3 13 2019-01-02
3 12 2019-01-03
3 27 2019-01-04
3 26 2019-01-05
3 26 2019-01-06
3 30 2019-01-07
3 19 2019-01-08
3 27 2019-01-09
3 29 2019-01-10
4 29 2019-01-01
4 12 2019-01-02
4 25 2019-01-03
4 11 2019-01-04
4 19 2019-01-05
4 20 2019-01-06
4 33 2019-01-07
4 24 2019-01-08
4 22 2019-01-09
4 24 2019-01-10
What I'm trying to achieve is to add a column TotalPast3Days that is basically the sum of Total of the previous 3 days (excluding the current date in the row) for that particular UserID
How can this be done?
CodePudding user response:
For the first 3 days, you will get a NaN
because there are no "previous 3 days (excluding the current date in the row)"; but, for the rest, you can use shift
like df['TotalPast3Days'] = df['Date'].shift(1) df['Date'].shift(2) df['Date'].shift(3)
CodePudding user response:
totals = []
for i in len(df.index):
if i < 3:
totals.append(0)
elif df['UserID'].iloc[i] == df['UserID'].iloc[i-3]:
total = df['Total'].iloc[i-1]
df['Total'].iloc[i-2]
df['Total'].iloc[i-3]
totals.append(total)
else:
totals.append(0)
df['Sum of past 3'] = total