Home > OS >  Resetting running count to zero at the end/start of every month in Python
Resetting running count to zero at the end/start of every month in Python

Time:07-19

I am creating a running count of claims and would like to reset the count to 0 at the start of every month.

I currently have these data frames:

DF1:

Date X Y
2021-04-25 4 4
2021-04-26 0 0
2021-04-27 0 0
2021-04-28 56 53
2021-04-29 0 0
2021-04-30 1 0
2021-05-01 0 0
2021-05-02 5 0
2021-05-03 5 5
2021-05-04 0 0

Df2:

Date X Y
2021-04-25 4 4
2021-04-26 0 0
2021-04-27 0 0
2021-04-28 56 53
2021-04-29 3 0
2021-04-30 4 0
2021-05-01 4 0
2021-05-02 9 0
2021-05-03 14 5
2021-05-04 9 0

In order to get to the df2 data frame the following code was used:

df2 = df1

df2['X'] = df1['X']   (df1['X'] - df1['Y']).cumsum().shift(1).fillna(0)

I am unsure how or where in this equation I should add the condition to reset count to 0 at the start of every month such that I have the below dataset.

df_final:

Date X Y
2021-04-25 4 4
2021-04-26 0 0
2021-04-27 0 0
2021-04-28 56 53
2021-04-29 3 0
2021-04-30 4 0
2021-05-01 0 0
2021-05-02 5 0
2021-05-03 10 5
2021-05-04 5 0

CodePudding user response:

This should work for you:

df = pd.DataFrame([
    ('2021-04-25', 4, 4),
    ('2021-04-26', 0, 0),
    ('2021-04-27', 0, 0),
    ('2021-04-28', 56, 53),
    ('2021-04-29', 0, 0),
    ('2021-04-30', 1, 0),
    ('2021-05-01', 0, 0),
    ('2021-05-02', 5, 0),
    ('2021-05-03', 5, 5),
    ('2021-05-04', 0, 0),
], columns=['Date', 'X', 'Y',])
df['Date'] = pd.to_datetime(df['Date'])

# Create Cumulative Sums for X 
df['X'] = (df['X'] - df['Y'].shift(fill_value=0)).cumsum()

result = pd.concat([
    group.assign(X=(
        group['X'] - group.loc[group['Date'].dt.is_month_start, 'X'].values[0]
        if len(group.loc[group['Date'].dt.is_month_start]) > 0
        else group['X']
    ))
    for month, group in df.groupby(df['Date'].dt.month)
])

Output:

        Date   X   Y
0 2021-04-25   4   4
1 2021-04-26   0   0
2 2021-04-27   0   0
3 2021-04-28  56  53
4 2021-04-29   3   0
5 2021-04-30   4   0
6 2021-05-01   0   0
7 2021-05-02   5   0
8 2021-05-03  10   5
9 2021-05-04   5   0

CodePudding user response:

Try:

Groupby(df[Date].dt.Month).agg(‘cumsum’)

  • Related