Home > Enterprise >  How to get the cumulative sum of n days in different months?
How to get the cumulative sum of n days in different months?

Time:07-01

I have this df:

       CODE      DATE     PP
17594  000130 1991-01-01  0.5
17595  000130 1991-01-02  11
17596  000130 1991-01-03  1
17597  000130 1991-01-04  2
17598  000130 1991-01-05  5
17599  000130 1991-01-06  2
17598  000130 1991-01-07  5
17598  000130 1991-01-08  7
17598  000130 1991-01-09  5
17598  000130 1991-01-10  1
17598  000130 1991-01-11  5
17598  000130 1991-01-12  1
17598  000130 1991-01-13  5
17598  000130 1991-01-14  1
17598  000130 1991-01-15  5
17598  000130 1991-01-16  3
17598  000130 1991-01-17  5
17598  000130 1991-01-18  3
17598  000130 1991-01-19  5
17598  000130 1991-01-20  3
17598  000130 1991-01-21  5
17598  000130 1991-01-22  2
17598  000130 1991-01-23  5
17598  000130 1991-01-24  2
17598  000130 1991-01-25  5
17598  000130 1991-01-26  1
17598  000130 1991-01-27  2
17598  000130 1991-01-28  5
17598  000130 1991-01-29  3
17598  000130 1991-01-30  5
17598  000130 1991-01-31  4
...    ...    ...         ...
28455  000130 1991-12-27  10
28456  000130 1991-12-28  12
28457  000130 1991-12-29  0
28458  000130 1991-12-30  0
28459  000130 1991-12-31  0

I want to get the cumulative sum of 10 days (3 per month) for the PP column. I'll give you and example:

For the month of April: 04-01 to 04-10, 04-11 to 04-20 and 04-21 to 04-30.

If the month has 31 days, eg January: 01-21 to 01-31 (11 days in the last cumulative sum).

If it has 28 days like February, from 02-21 to 02-28 (just 8 days in the last cumulative sum) and if it is a leap year, from 02-21 to 02-29 (9 days in the last cumulative sum).

Expected result:

       CODE      DATE     PP
17594  000130 1991-01-01  39.5
17595  000130 1991-01-11  36
17596  000130 1991-01-21  39
17597  000130 1991-02-01  27
17598  000130 1991-02-11  30
17598  000130 1991-02-21  18
      ...        ...  ...
28455  000130 1991-11-01  22
28456  000130 1991-11-11  21
28457  000130 1991-11-21  12
28458  000130 1991-12-01  11
28459  000130 1991-12-11  10
28460  000130 1991-12-21  25

I tried this code but it sums the 10 day with no difference between the months with 31, 30, 29 and 28 days.

cumulative=df['PP'].resample('10D').sum().round(1)

Would you mind to help me?

Thanks in advance.

CodePudding user response:

I believe this should work:

s = df['DATE']
(df
 .groupby([
    s.dt.year,
    s.dt.month,
    s.dt.day.clip(upper=30).sub(1).floordiv(10)
    ], as_index=False)
 .agg({'CODE':'first', 'DATE':'first', 'PP':'sum'}))

Output:

   CODE       DATE    PP
0   130 1991-01-01  39.5
1   130 1991-01-11  36.0
2   130 1991-01-21  39.0
  • Related