I am trying to calculate cumsum for last 3 months for each row level. So, my main data frame looks like this
ID | Month | Level_1 |
---|---|---|
1 | AUG_15 | 1 |
1 | SEP_15 | 0 |
1 | OCT_15 | 1 |
1 | NOV_15 | 1 |
1 | DEC_15 | 0 |
1 | JAN_16 | 1 |
1 | FEB_16 | 1 |
1 | MAR_16 | 1 |
2 | AUG_15 | 1 |
2 | SEP_15 | 1 |
2 | OCT_15 | 1 |
2 | NOV_15 | 1 |
2 | DEC_15 | 1 |
2 | JAN_16 | 1 |
2 | FEB_16 | 1 |
2 | MAR_16 | 1 |
and my resultant desired output is
ID | Month | Level_1 | Level_1_m3 |
---|---|---|---|
1 | AUG_15 | 1 | 1 |
1 | SEP_15 | 0 | 1 |
1 | OCT_15 | 1 | 2 |
1 | NOV_15 | 1 | 2 |
1 | DEC_15 | 0 | 2 |
1 | JAN_16 | 1 | 2 |
1 | FEB_16 | 1 | 2 |
1 | MAR_16 | 1 | 3 |
2 | AUG_15 | 1 | 1 |
2 | SEP_15 | 1 | 2 |
2 | OCT_15 | 1 | 3 |
2 | NOV_15 | 1 | 3 |
2 | DEC_15 | 1 | 3 |
2 | JAN_16 | 1 | 3 |
2 | FEB_16 | 1 | 3 |
2 | MAR_16 | 1 | 3 |
so, basically the m3 columns looks at last three months from a particular and calculate cumsum. e.g. for Id 1 and month Mar_16, cumsum value is 3 as it is calculated using values of Mar_16, Feb_16 and Jan_16.
Is there are builtin method that can help achieve this in pandas?
CodePudding user response:
So, someone did posted an answer(got removed later) that I should explore pandas rolling method and it did the job.
Here is my current solution:-
df.groupby('ID')['Level_1'].rolling(3, min_periods=1).sum()
CodePudding user response:
Another way is using shift, if your data is already sorted
df["Level_1_m3"]=df["Level_1"] df["Level_1"].shift(1).fillna(0) df["Level_1"].shift(2).fillna(0)