Home > Software design >  Can I sum a value over different times in python?
Can I sum a value over different times in python?

Time:05-31

I have a Dataframe that looks like this

WEEK DELIVERY_BOY_ID TOTAL_GMV
2022-04-04 999999999.0 470510.11
2022-04-11 999999999.0 557351.02
2022-04-18 999999999.0 454225.78
2022-04-25 999999999.0 527932.46
2022-05-02 999999999.0 556741.18
2022-05-09 999999999.0 524571.93
2022-05-16 999999999.0 547195.66
2022-05-23 999999999.0 112423.49

What I'm trying to do is to sum the TOTAL_GMV every 4 weeks (sum from week 2022-05-02 to 2022-05-23, sum from 2022-04-25 to 2022-05-16 and so on for every week), and to show me the result with the date of the last week that was summed

So, the final result should look something like this:

WEEK DELIVERY_BOY_ID TOTAL_GMV EXPLANATION
2022-04-04 999999999.0 ********* Sum from 2022-03-14 to 2022-04-04
2022-04-11 999999999.0 ********* Sum from 2022-03-21 to 2022-04-11
2022-04-18 999999999.0 ********* Sum from 2022-03-28 to 2022-04-18
2022-04-25 999999999.0 2.010.018,91 Sum from 2022-04-04 to 2022-04-25
2022-05-02 999999999.0 2.096.250,44 Sum from 2022-04-11 to 2022-05-02
2022-05-09 999999999.0 2.063.469,15 Sum from 2022-04-18 to 2022-05-09
2022-05-16 999999999.0 2.156.441,23 Sum from 2022-04-25 to 2022-05-16
2022-05-23 999999999.0 1.639.932,26 Sum from 2022-05-02 to 2022-05-23

Any idea how to do it?

Thanks!!

CodePudding user response:

Assuming that WEEK is the index, you can do:

>>> df.TOTAL_GMV.rolling(4).sum()
WEEK
2022-04-04           NaN
2022-04-11           NaN
2022-04-18           NaN
2022-04-25    2010019.37
2022-05-02    2096250.44
2022-05-09    2063471.35
2022-05-16    2156441.23
2022-05-23    1740932.26
Name: TOTAL_GMV, dtype: float64

To add it to the df,

df['TOTAL_GMV'] = df.TOTAL_GMV.rolling(4).sum()

(If it's not the index, change that to df.set_index('WEEK').TOTAL_GMV.rolling(4).sum().)

  • Related