The business scenario here is to calculate the customer tenure with the service provider. Customer tenure is calculated based on below aspects:
- Oldest account start date to be taken for tenure calculation
- One Customer can have more than 1 active account at a given time
- Cooldown period is 6 months, i.e., if a customer has to stay as a customer, s/he has 6 months to open a new account with the provider after closing the account or should already have another account open before closing
- If the customer opens an account post 6 months then the tenure calculation happens from the new account open date
We can better understand this with an example: (values in bold are Customer since/tenure-start date)
Customer_ID | ACCT_SERIAL_NUM | ACCT_STRT_DT | ACCT_END_DT | COMMENTS |
---|---|---|---|---|
11111 | Account1 | 2000-01-20 | (null) | Customer already had an active account before closing the existing account |
11111 | Account2 | 2002-12-10 | 2021-09-22 | |
11111 | Account3 | 2021-10-22 | (null) |
Customer_ID | ACCT_SERIAL_NUM | ACCT_STRT_DT | ACCT_END_DT | COMMENTS |
---|---|---|---|---|
11112 | Account1 | 2000-01-20 | 2002-08-10 | Account closed but customer opened another account within cooling period of 6months |
11112 | Account2 | 2002-12-10 | 2021-09-22 | |
11112 | Account3 | 2021-10-22 | (null) |
Customer_ID | ACCT_SERIAL_NUM | ACCT_STRT_DT | ACCT_END_DT | COMMENTS |
---|---|---|---|---|
11113 | Account1 | 2000-01-20 | 2002-05-10 | Account closed but customer didn't open another account within cooling period of 6months |
11113 | Account2 | 2002-12-10 | 2021-09-22 | Hence this is the new customer tenure start date |
11113 | Account3 | 2021-10-22 | (null) |
The query I was trying (below) could possibly help me if the events occur sequentially (like in above 3 scenarios)
With dataset as (
SELECT Customer_ID, ACCT_SERIAL_NUM, ACCT_STRT_DT, ACCT_END_DT, COMMENTS,
CASE WHEN NVL(LEAD(ACCT_STRT_DT, 1) OVER(PARTITION BY Customer_ID ORDER BY ACCT_STRT_DT asc ) , SYSDATE-1 ) < ADD_MONTHS(nvl(acct_end_dt, SYSDATE), 6)
THEN 'Y' ELSE 'N' END as ACTV_FLG
FROM calc_customer_tenure ct
order by Customer_ID, ACCT_STRT_DT asc )
SELECT
Customer_ID, MIN(CASE WHEN FLAG = 'Y' THEN ACCT_STRT_DT ELSE NULL END) as CUST_TNUR
FROM (
SELECT ds.*,
CASE WHEN ACCT_END_DT is NULL
THEN 'Y' ELSE MIN(ACTV_FLG) OVER (PARTITION BY Customer_ID ORDER BY ACCT_STRT_DT asc ROWS between current row and unbounded following)
END as FLAG
from dataset ds )
GROUP BY Customer_ID ORDER BY Customer_ID ;
but fails for the below scenario: (which is an ideal real-world scenario)
Unfortunately the above code takes account3 as start date instead of taking account1:
Customer_ID | ACCT_SERIAL_NUM | ACCT_STRT_DT | ACCT_END_DT | COMMENTS |
---|---|---|---|---|
11114 | Account1 | 2000-01-20 | 2021-08-22 | Customer has closed this account(1) after subsequent account(2) is closed. But then has opened an account(3) within 6 months of closing the account(1) hence this is the tenure start date |
11114 | Account2 | 2002-12-10 | 2003-12-10 | |
11114 | Account3 | 2021-10-22 | (null) |
CodePudding user response:
Try this, gives the correct results for the dataset you supplied, but you should test more scenarii:
with data(customer_id,acct_serial_num,acct_strt_dt,acct_end_dt,comments) as
(
select '11111', 'Account1', to_date('2000-01-20','yyyy-mm-dd'), cast(NULL AS DATE), 'Customer already had an active account before closing the existing account' from dual union all
select '11111', 'Account2', to_date('2002-12-10','yyyy-mm-dd'), to_date('2021-09-22','yyyy-mm-dd'), '' from dual union all
select '11111', 'Account3', to_date('2021-10-22','yyyy-mm-dd'), cast(NULL AS DATE), '' from dual union all
select '11112', 'Account1', to_date('2000-01-20','yyyy-mm-dd'), to_date('2002-08-10','yyyy-mm-dd'), 'Account closed but customer opened another account within cooling period of 6months' from dual union all
select '11112', 'Account2', to_date('2002-12-10','yyyy-mm-dd'), to_date('2021-09-22','yyyy-mm-dd'), '' from dual union all
select '11112', 'Account3', to_date('2021-10-22','yyyy-mm-dd'), cast(NULL AS DATE), '' from dual union all
select '11113', 'Account1', to_date('2000-01-20','yyyy-mm-dd'), to_date('2002-05-10','yyyy-mm-dd'), 'Account closed but customer didn''t open another account within cooling period of 6months' from dual union all
select '11113', 'Account2', to_date('2002-12-10','yyyy-mm-dd'), to_date('2021-09-22','yyyy-mm-dd'), 'Hence this is the new customer tenure start date' from dual union all
select '11113', 'Account3', to_date('2021-10-22','yyyy-mm-dd'), cast(NULL AS DATE), '' from dual union all
select '11114', 'Account1', to_date('2000-01-20','yyyy-mm-dd'), to_date('2021-08-22','yyyy-mm-dd'), 'Customer has closed this account(1) after subsequent account(2) is closed. But then has opened an account(3) within 6 months of closing the account(1) hence this is the tenure start date' from dual union all
select '11114', 'Account2', to_date('2002-12-10','yyyy-mm-dd'), to_date('2003-12-10','yyyy-mm-dd'), '' from dual union all
select '11114', 'Account3', to_date('2021-10-22','yyyy-mm-dd'), cast(NULL AS DATE), '' from dual
),
datawc as (
select d.customer_id,acct_serial_num,acct_strt_dt, nvl(d.acct_end_dt, to_date('2999-12-31','yyyy-mm-dd')) as acct_end_dt,
nvl(add_months(acct_end_dt,6),to_date('2999-12-31','yyyy-mm-dd')) as cooldown_end_dt,
case when
acct_strt_dt < add_months(lag(acct_end_dt,1) over(partition by customer_id order by acct_strt_dt),6)
then 1 else 0 end as prev_within_cooldown,
case when
add_months(nvl(acct_end_dt, to_date('2999-12-31','yyyy-mm-dd')),6) > lead(acct_strt_dt,1) over(partition by customer_id order by acct_strt_dt)
then 1 else 0 end as next_within_cooldown,
d.comments
from data d
),
mergeddata as (
select customer_id, acct_serial_num, acct_strt_dt, acct_end_dt, prev_within_cooldown, next_within_cooldown, cooldown_end_dt, comments
from datawc d
match_recognize(
partition by customer_id
order by acct_strt_dt,acct_end_dt
measures first(acct_serial_num) as acct_serial_num, cooldown_end_dt as cooldown_end_dt,
first(prev_within_cooldown) as prev_within_cooldown, first(next_within_cooldown) as next_within_cooldown,
comments as comments, first(acct_strt_dt) as acct_strt_dt, max(acct_end_dt) as acct_end_dt
pattern( merged* str)
define merged as acct_end_dt >= next(acct_strt_dt)
)
)
select d.customer_id, min(acct_strt_dt) as tenure_dt
from mergeddata d
where next_within_cooldown = 1
group by d.customer_id
;
CUSTO TENURE_DT
----- -----------
11111 20-JAN-2000
11112 20-JAN-2000
11113 10-DEC-2002
11114 20-JAN-2000
CodePudding user response:
Thanks to Akina I was able to re-write the query to fit as required! Also thanks to P3Consulting for contributing! Really appreciate the support!
Re-posting the final SQL here for the Oracle which helped with my use case:
Below is using Recursive CTEs
WITH cte1 as (
SELECT customer_id, ACCT_STRT_DT, ACCT_END_DT,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY ACCT_STRT_DT) rn
FROM calc_customer_tenure
), cte2 (customer_id, ACCT_STRT_DT, ACCT_END_DT, rn, tenure_start_date, tenure_end_date) AS (
SELECT customer_id, ACCT_STRT_DT, ACCT_END_DT, rn,
ACCT_STRT_DT tenure_start_date,
ACCT_END_DT tenure_end_date
FROM cte1
WHERE rn = 1
UNION ALL
SELECT cte1.customer_id, cte1.ACCT_STRT_DT, cte1.ACCT_END_DT, cte1.rn,
CASE WHEN cte1.ACCT_STRT_DT > ADD_MONTHS(cte2.tenure_end_date, 6)
THEN cte1.ACCT_STRT_DT
ELSE cte2.tenure_start_date
END,
CASE WHEN cte1.ACCT_STRT_DT > ADD_MONTHS(cte2.tenure_end_date, 6)
THEN cte1.ACCT_END_DT
ELSE GREATEST(cte1.ACCT_END_DT, cte2.tenure_end_date)
END
FROM cte1
JOIN cte2 ON cte1.customer_id = cte2.customer_id AND cte1.rn = cte2.rn 1
)
SELECT customer_id, CASE WHEN ADD_MONTHS(NVL(tenure_end_date, SYSDATE), 6) < SYSDATE THEN NULL ELSE tenure_start_date END AS CUSTOMER_TENURE_START_DATE FROM (
SELECT
cte2.*, row_number() over (partition by customer_id order by rn desc) as rank_derv
FROM cte2 ) subset
WHERE rank_derv = 1
ORDER BY 1,2 ;
I am also posting one which may work in case of Oracle only (since it uses hierarchical query syntax):
WITH dataset_rnkd as (
SELECT CT.*, row_number() over (partition by customer_id order by ACCT_STRT_DT DESC) as row_rnk
from calc_customer_tenure CT
)
SELECT customer_id, MIN(ACCT_STRT_DT) as CUSTOMER_TENURE FROM (
SELECT * FROM dataset_rnkd
START WITH ADD_MONTHS(NVL(ACCT_END_DT, SYSDATE), 6) >= SYSDATE
CONNECT BY NOCYCLE PRIOR customer_id = customer_id AND PRIOR ACCT_STRT_DT <= ADD_MONTHS(NVL(ACCT_END_DT, SYSDATE), 6)
) DS
GROUP BY customer_id
ORDER BY customer_id ;