Check for 3 consecutive declined payment dates based on name-CodePudding

Jim and Bob are part of a loan repayment program. They need to be making monthly payments to pay off their loans. Sometimes, a payment can be declined for various reasons. I would like to find when there are 3 consecutive declined payments in a row(so no complete payment in-between any of the 3).

Here are the dataframes:

DF1
| Name | Test | Due Date               |
| -----| ---- |------------------------|
| Jim  | 1    |   2020-05-10           |
| Bob  | 2    |   2021-06-11           |

Here we have the "payment" dataframe.
DF2
| Name | Payment Date | Declined/Complete  |
| -----|------------  | -------------------|
| Jim  | 2020-03-9    | declined           |
| Jim  | 2020-04-5    | declined           |
| Jim  | 2020-05-6    | declined           |
| Bob  | 2021-03-20   | declined           |
| Bob  | 2021-04-11   | complete           |
| Bob  | 2021-05-11   | declined

Because Jim has had 3 consecutive declined payments before his due date, he gets flagged(1).     
Bob had a complete payment in-between his last 3 payments, so he does not get flagged(0).

Expected Output
| Name | Due Date    | 3 consecutive declines    | 
| -----|-------------| --------------------------|
| Jim  |2020-05-10   |1                          |
| Bob  |2022-03-10   |0                          |

I was thinking of utilizing joins or some group by counts, but I can't seem to figure out how it would all fit together.

Edit: Also, the 3 consecutive declines need to be before the due date.

CodePudding user response：

You can use a groupby rolling:

(df2['Declined/Complete']
 .eq('declined')
 .groupby(df['Name'])
 .apply(lambda g: g.rolling(3).sum().eq(3).any()).astype(int)
)

Output:

Name
Bob    0
Jim    1
Name: Declined/Complete, dtype: int64

To filter the dates and assign in df1, use:

df1['Due Date'] = pd.to_datetime(df1['Due Date'])
df2['Payment Date'] = pd.to_datetime(df2['Payment Date'])

df1['3 cons. dec.'] = df1['Name'].map(df2
 .loc[df2['Payment Date'].le(df2['Name']
         .map(df1.set_index('Name')['Due Date'])),
      'Declined/Complete']
 .eq('declined')
 .groupby(df2['Name'])
 .apply(lambda g: g.rolling(3).sum().eq(3).any()).astype(int)
)

output:

  Name  Test   Due Date  3 cons. dec.
0  Jim     1 2020-05-10             1
1  Bob     2 2021-06-11             0