I have this df:
CODE DATE PP
17594 000130 1991-01-01 0.5
17595 000130 1991-01-02 11
17596 000130 1991-01-03 1
17597 000130 1991-01-04 2
17598 000130 1991-01-05 5
17599 000130 1991-01-06 2
17598 000130 1991-01-07 5
17598 000130 1991-01-08 7
17598 000130 1991-01-09 5
17598 000130 1991-01-10 1
17598 000130 1991-01-11 5
17598 000130 1991-01-12 1
17598 000130 1991-01-13 5
17598 000130 1991-01-14 1
17598 000130 1991-01-15 5
17598 000130 1991-01-16 3
17598 000130 1991-01-17 5
17598 000130 1991-01-18 3
17598 000130 1991-01-19 5
17598 000130 1991-01-20 3
17598 000130 1991-01-21 5
17598 000130 1991-01-22 2
17598 000130 1991-01-23 5
17598 000130 1991-01-24 2
17598 000130 1991-01-25 5
17598 000130 1991-01-26 1
17598 000130 1991-01-27 2
17598 000130 1991-01-28 5
17598 000130 1991-01-29 3
17598 000130 1991-01-30 5
17598 000130 1991-01-31 4
... ... ... ...
28455 000130 1991-12-27 10
28456 000130 1991-12-28 12
28457 000130 1991-12-29 0
28458 000130 1991-12-30 0
28459 000130 1991-12-31 0
I want to get the cumulative sum of 10 days (3 per month) for the PP column. I'll give you and example:
For the month of April: 04-01 to 04-10, 04-11 to 04-20 and 04-21 to 04-30.
If the month has 31 days, eg January: 01-21 to 01-31 (11 days in the last cumulative sum).
If it has 28 days like February, from 02-21 to 02-28 (just 8 days in the last cumulative sum) and if it is a leap year, from 02-21 to 02-29 (9 days in the last cumulative sum).
Expected result:
CODE DATE PP
17594 000130 1991-01-01 39.5
17595 000130 1991-01-11 36
17596 000130 1991-01-21 39
17597 000130 1991-02-01 27
17598 000130 1991-02-11 30
17598 000130 1991-02-21 18
... ... ...
28455 000130 1991-11-01 22
28456 000130 1991-11-11 21
28457 000130 1991-11-21 12
28458 000130 1991-12-01 11
28459 000130 1991-12-11 10
28460 000130 1991-12-21 25
I tried this code but it sums the 10 day with no difference between the months with 31, 30, 29 and 28 days.
cumulative=df['PP'].resample('10D').sum().round(1)
Would you mind to help me?
Thanks in advance.
CodePudding user response:
I believe this should work:
s = df['DATE']
(df
.groupby([
s.dt.year,
s.dt.month,
s.dt.day.clip(upper=30).sub(1).floordiv(10)
], as_index=False)
.agg({'CODE':'first', 'DATE':'first', 'PP':'sum'}))
Output:
CODE DATE PP
0 130 1991-01-01 39.5
1 130 1991-01-11 36.0
2 130 1991-01-21 39.0