I have a large table of customers and their policies that are set up like this (the actual dates can be throughout the year):
Record | CustomerID | Name | PolicyStart | PolicyEnd | Agent | Status |
---|---|---|---|---|---|---|
1 | 12345 | Jane Doe | 20060101 | 20081231 | John Smith | Term |
2 | 12345 | Jane Doe | 20090101 | 20101231 | Sarah Po | Term |
3 | 12345 | Jane Doe | 20110101 | Sarah Po | Active | |
4 | 12346 | Joe Dirt | 20130101 | 20141231 | Blake Lee | Term |
5 | 12346 | Joe Dirt | 20200101 | Blake Lee | Active |
I've been using a basic SQL query to get all the active accounts, by selecting the records where the status is "Active" and spitting out the list of active policies, like so:
SELECT * FROM Customers WHERE Status = 'Active'
Record | CustomerID | Name | PolicyStart | PolicyEnd | Agent | Status |
---|---|---|---|---|---|---|
3 | 12345 | Jane Doe | 20110101 | Sarah Po | Active | |
5 | 12346 | Joe Dirt | 20200101 | Blake Lee | Active |
However, I've been asked to now provide a result like this:
Record | CustomerID | Name | OriginalEffectiveDate | Agent | Status |
---|---|---|---|---|---|
3 | 12345 | Jane Doe | 20060101 | Sarah Po | Active |
5 | 12346 | Joe Dirt | 20200101 | Blake Lee | Active |
The specific ask is to calculate the OriginalEffectiveDate column based on the policy start and end dates using this criteria: Use the earliest policy start date as the customer's OriginalEffectiveDate, as long as there has been no break in coverage of one month or greater. If there has been a gap (or gaps), use the first date after the latest gap instead.
So, Jane's original effective date would be 20060101, as that's when she first got coverage, and she has not had any breaks in coverage. However, Joe's original effective date would be 20200101, as he had a break (one month or greater) after his initial policy, and that is the first date after the gap.
I've attempted to mess around with CASE, MIN, MAX, and TOP statements, to no avail. I have no idea how to proceed. I believe in "teaching someone to fish", so please feel free to point me in the direction of a tutorial. The sticking point is the "one month or greater" and "after the latest gap" parts. I tried to search for a solution online, and I haven't come across anything similar, but perhaps I don't know the right way to phrase the question.
CodePudding user response:
One possible solution you can try is as follows using lag.
First get the gap in months between the current PolicyEnd and the next PolicyStart, it's possible to then find the most recent record for each customer where the gap was greater than 1 month.
Then you can select the minimum valid record for each customer after this point.
with gaps as (
select *,
case when DateDiff(month, PolicyEnd,Lead(PolicyStart) over(partition by CustomerID order by Record)) > 1 then 1 else 0 end b
from t
), minrecord as (
select CustomerID, Max(Record) mr
from gaps
where b = 1
group by CustomerID
)
select Record, CustomerID, [Name], Agent, [Status], ed.OriginialEffectiveDate
from t
cross apply (
select Min(PolicyStart) OriginialEffectiveDate
from t t2
where t2.CustomerID = t.CustomerID and
t2.Record > IsNull((select mr from minrecord where CustomerID = t2.CustomerID), 0)
)ed
where t.[Status] = 'active';
See this demo Fiddle
CodePudding user response:
Here's what I ended up going with:
SELECT Record
,Customers.CustomerID
,Name
,CustomerEffectiveDate
,Agent
,Status
FROM Customers
JOIN (
SELECT CustomerID
,MIN(PolicyStart) AS CustomerEffectiveDate
FROM (
SELECT CustomerID
,PolicyStart
,DateDiff(month, PolicyEnd, Lead(PolicyStart) OVER (
PARTITION BY CustomerID ORDER BY CustomerID
,PolicyStart
)) AS gapdays
FROM Customers
) gaps
WHERE gapdays < 2
OR gapdays IS NULL
GROUP BY CustomerID
) minstart ON customers.CustomerID = minstart.CustomerID
WHERE status = 'Active'
It's probably not the most efficient code, but it seems to work.