Home > Back-end >  Compare a few columns for 2 different dates in same table, return what is different?
Compare a few columns for 2 different dates in same table, return what is different?

Time:08-30

I have a simple table that contains (a rolling) 2 BUSINESS_DATES of data. Each day I want to compare the data for each of those dates and return what has NEW and CHANGED. Unfortunately there is no primary key. A sample of the table looks like this.

BUSINESS_DATE    EMPLOYEE_ID       DEPT ACCESS       ACTIVE      CREATE_DATE      APPROVED_BY
-------------    ------------      --------    ------------     -------------     ------------
8/28/2022        1                 dept 1            Y           1/1/2022         john          
8/28/2022        1                 dept 3            Y           1/1/2022         john         
8/28/2022        1                 dept 6            Y           1/1/2022         john        
8/28/2022        2                 dept 1            Y           1/1/2022         john        
8/28/2022        2                 dept 2            Y           1/1/2022         john        
8/28/2022        2                 dept 3            Y           1/1/2022         john        
8/28/2022        2                 dept 4            Y           1/1/2022         john        

8/29/2022        1                 dept 1            Y           1/1/2022         john          
8/29/2022        1                 dept 3            Y           1/1/2022         john         
8/29/2022        1                 dept 6            N           1/1/2022         john        * CHANGED (active flag)    
8/29/2022        1                 dept 4            Y           1/1/2022         john        * NEW
8/29/2022        2                 dept 1            Y           1/1/2022         john        
8/29/2022        2                 dept 2            Y           1/1/2022         john        
8/29/2022        2                 dept 3            Y           1/1/2022         john        
8/29/2022        2                 dept 4            Y           1/1/2022         john        

So in the data above I'd like to return the 2 rows I've marked * and if possible be able to create a new column that identifies whether it's a NEW entry/row or a CHANGED row/entry.

I tried playing around with FULL OUTER JOINS but I wasn't getting the results I was hoping for. Any help would be GREATLY appreciated!! By the way using SQL 2016 in case that is relevant.

CodePudding user response:

One way you could consider is outer joining using apply and a checksum for comparing changes, such as:

with c as (
  select *, 
    Checksum(ACTIVE, CREATE_DATE, APPROVED_BY) chk, 
    Dense_Rank() over(order by BUSINESS_DATE) rnk
  from t
)
select c.BUSINESS_DATE, c.EMPLOYEE_ID, c.DEPT_ACCESS, c.ACTIVE, c.CREATE_DATE, c.APPROVED_BY, 
  Iif(rnk = 1, '', IsNull(d.NewStatus, 'New'))
from c
outer apply(
  select case when c2.chk = c.chk then '' else 'Changed' end
  from c c2
  where c2.BUSINESS_DATE < c.BUSINESS_DATE
    and c2.EMPLOYEE_ID = c.EMPLOYEE_ID
    and c2.DEPT_ACCESS = c.DEPT_ACCESS
)d(NewStatus)
order by BUSINESS_DATE, EMPLOYEE_ID;

CodePudding user response:

I don't believe you would need a full join if the latest date must have at least as many rows as the prior day:

select d0.*,
    case when d1.EMPLOYEE_ID is null then 'New'
         when d1.ACTIVE <> d0.ACTIVE or d1.CREATE_DATE <> d0.CREATE_DATE
           or d1.APPROVED_BY <> d0.APPROVED_BY then 'Changed'
         else 'Unmodified' end as Status
from T d0 left outer join T d1
    on     d1.EMPLOYEE_ID = d0.EMPLOYEE_ID and d1.DEPT = d0.DEPT
           /* this would readily handle weekend/holiday gaps */
       and d1.BUSINESS_DATE < d0.BUSINESS_DATE 
where d0.BUSINESS_DATE = cast(getdate() as date);
  • Related