I have a simple table that contains (a rolling) 2 BUSINESS_DATES of data. Each day I want to compare the data for each of those dates and return what has NEW and CHANGED. Unfortunately there is no primary key. A sample of the table looks like this.
BUSINESS_DATE EMPLOYEE_ID DEPT ACCESS ACTIVE CREATE_DATE APPROVED_BY
------------- ------------ -------- ------------ ------------- ------------
8/28/2022 1 dept 1 Y 1/1/2022 john
8/28/2022 1 dept 3 Y 1/1/2022 john
8/28/2022 1 dept 6 Y 1/1/2022 john
8/28/2022 2 dept 1 Y 1/1/2022 john
8/28/2022 2 dept 2 Y 1/1/2022 john
8/28/2022 2 dept 3 Y 1/1/2022 john
8/28/2022 2 dept 4 Y 1/1/2022 john
8/29/2022 1 dept 1 Y 1/1/2022 john
8/29/2022 1 dept 3 Y 1/1/2022 john
8/29/2022 1 dept 6 N 1/1/2022 john * CHANGED (active flag)
8/29/2022 1 dept 4 Y 1/1/2022 john * NEW
8/29/2022 2 dept 1 Y 1/1/2022 john
8/29/2022 2 dept 2 Y 1/1/2022 john
8/29/2022 2 dept 3 Y 1/1/2022 john
8/29/2022 2 dept 4 Y 1/1/2022 john
So in the data above I'd like to return the 2 rows I've marked * and if possible be able to create a new column that identifies whether it's a NEW entry/row or a CHANGED row/entry.
I tried playing around with FULL OUTER JOINS but I wasn't getting the results I was hoping for. Any help would be GREATLY appreciated!! By the way using SQL 2016 in case that is relevant.
CodePudding user response:
One way you could consider is outer joining using apply and a checksum for comparing changes, such as:
with c as (
select *,
Checksum(ACTIVE, CREATE_DATE, APPROVED_BY) chk,
Dense_Rank() over(order by BUSINESS_DATE) rnk
from t
)
select c.BUSINESS_DATE, c.EMPLOYEE_ID, c.DEPT_ACCESS, c.ACTIVE, c.CREATE_DATE, c.APPROVED_BY,
Iif(rnk = 1, '', IsNull(d.NewStatus, 'New'))
from c
outer apply(
select case when c2.chk = c.chk then '' else 'Changed' end
from c c2
where c2.BUSINESS_DATE < c.BUSINESS_DATE
and c2.EMPLOYEE_ID = c.EMPLOYEE_ID
and c2.DEPT_ACCESS = c.DEPT_ACCESS
)d(NewStatus)
order by BUSINESS_DATE, EMPLOYEE_ID;
CodePudding user response:
I don't believe you would need a full join if the latest date must have at least as many rows as the prior day:
select d0.*,
case when d1.EMPLOYEE_ID is null then 'New'
when d1.ACTIVE <> d0.ACTIVE or d1.CREATE_DATE <> d0.CREATE_DATE
or d1.APPROVED_BY <> d0.APPROVED_BY then 'Changed'
else 'Unmodified' end as Status
from T d0 left outer join T d1
on d1.EMPLOYEE_ID = d0.EMPLOYEE_ID and d1.DEPT = d0.DEPT
/* this would readily handle weekend/holiday gaps */
and d1.BUSINESS_DATE < d0.BUSINESS_DATE
where d0.BUSINESS_DATE = cast(getdate() as date);