Home > Mobile >  Python Pandas - Cumulative sum by date (Month-Year) and unique user
Python Pandas - Cumulative sum by date (Month-Year) and unique user

Time:07-23

I would like to add the cumulative daily sum of the column ['PR'] by month-year while also grouping by ['PERMNO'].

Here is a snippet of my dataframe:

        PERMNO        date  PR
0        10025  2012-09-17   0
1        10025  2012-09-18   0
2        10025  2012-09-19   0
3        10025  2012-09-20   1
4        10025  2012-09-21   0
...        ...         ... ...
8567613  93436  2017-12-24   1
8567614  93436  2017-12-26   1
8567615  93436  2017-12-29   0
8567616  93436  2017-12-30   1
8567617  93436  2017-12-31   0

I would like to get something like this:

     PERMNO     date  PR
0     10025  2012-09  10
1     10025  2012-10   6
2     10025  2012-11   4
3     10025  2012-12  20
4     10073  2012-01   5
...     ...      ... ...
n-4   93436  2017-08  15
n-3   93436  2017-09  17
n-2   93436  2017-10   4
n-1   93436  2017-11  13
n     93436  2017-12  10

I'm mainly using Pandas and NumPy.

Thank you in advance.

CodePudding user response:

Here is the result when summing by PERMNO and year-month, though the expected result in the question does not match the provided data

df['date'] = pd.to_datetime(df['date'])
df.groupby(['PERMNO', df['date'].dt.strftime('%Y-%m') ]).sum().reset_index()
    PERMNO     date     PR
0   10025   2012-09     1
1   93436   2017-12     3
  • Related