Resetting running count to zero at the end/start of every month in Python-CodePudding

I am creating a running count of claims and would like to reset the count to 0 at the start of every month.

I currently have these data frames:

DF1:

Date	X	Y
2021-04-25	4	4
2021-04-26	0	0
2021-04-27	0	0
2021-04-28	56	53
2021-04-29	0	0
2021-04-30	1	0
2021-05-01	0	0
2021-05-02	5	0
2021-05-03	5	5
2021-05-04	0	0

Df2:

Date	X	Y
2021-04-25	4	4
2021-04-26	0	0
2021-04-27	0	0
2021-04-28	56	53
2021-04-29	3	0
2021-04-30	4	0
2021-05-01	4	0
2021-05-02	9	0
2021-05-03	14	5
2021-05-04	9	0

In order to get to the df2 data frame the following code was used:

df2 = df1

df2['X'] = df1['X']   (df1['X'] - df1['Y']).cumsum().shift(1).fillna(0)

I am unsure how or where in this equation I should add the condition to reset count to 0 at the start of every month such that I have the below dataset.

df_final:

Date	X	Y
2021-04-25	4	4
2021-04-26	0	0
2021-04-27	0	0
2021-04-28	56	53
2021-04-29	3	0
2021-04-30	4	0
2021-05-01	0	0
2021-05-02	5	0
2021-05-03	10	5
2021-05-04	5	0

CodePudding user response：

This should work for you:

df = pd.DataFrame([
    ('2021-04-25', 4, 4),
    ('2021-04-26', 0, 0),
    ('2021-04-27', 0, 0),
    ('2021-04-28', 56, 53),
    ('2021-04-29', 0, 0),
    ('2021-04-30', 1, 0),
    ('2021-05-01', 0, 0),
    ('2021-05-02', 5, 0),
    ('2021-05-03', 5, 5),
    ('2021-05-04', 0, 0),
], columns=['Date', 'X', 'Y',])
df['Date'] = pd.to_datetime(df['Date'])

# Create Cumulative Sums for X 
df['X'] = (df['X'] - df['Y'].shift(fill_value=0)).cumsum()

result = pd.concat([
    group.assign(X=(
        group['X'] - group.loc[group['Date'].dt.is_month_start, 'X'].values[0]
        if len(group.loc[group['Date'].dt.is_month_start]) > 0
        else group['X']
    ))
    for month, group in df.groupby(df['Date'].dt.month)
])

Output:

        Date   X   Y
0 2021-04-25   4   4
1 2021-04-26   0   0
2 2021-04-27   0   0
3 2021-04-28  56  53
4 2021-04-29   3   0
5 2021-04-30   4   0
6 2021-05-01   0   0
7 2021-05-02   5   0
8 2021-05-03  10   5
9 2021-05-04   5   0

CodePudding user response：

Try:

Groupby(df[Date].dt.Month).agg(‘cumsum’)