Comparing specific rows per a group of records

Time:08-24

I have a table that I want to monitor for differences in counts based on a group of records.

In the table below, records are grouped by Source name (Customer, Product, Service). I want to detect when the Total_Count of the ‘staging_’ and ‘delta_’ Entity rows differs.

For example, for the Source Customer, the Total_Count for the Entity staging_Customer and delta_Customer is the same, so the output should return the difference 755 - 755 = 0. The same applies to the Source Service: 340 - 340 = 0.

However, for the Source Product, the staging_Product and delta_Product Total_Count values are not the same, so the query should return the difference: 240 - 0 = 240.

Each source always has 4 Entity records, and the naming convention is always the same (hdp_[Source]_sql, staging_[Source], delta_[Source], final_[Source]).

Run_Date  Source    Process           Entity            Total_Count
20180101  Customer  tr_Customer_Data  hdp_Customer_sql  1500
20180101  Customer  tr_Customer_Data  staging_Customer  755
20180101  Customer  tr_Customer_Data  delta_Customer    755
20180101  Customer  tr_Customer_Data  final_Customer    755
20180101  Product   tr_Product_Data   hdp_Product_sql   570
20180101  Product   tr_Product_Data   staging_Product   240
20180101  Product   tr_Product_Data   delta_Product     0
20180101  Product   tr_Product_Data   final_Product     0
20180101  Service   tr_Service_Data   hdp_Service_sql   2300
20180101  Service   tr_Service_Data   staging_Service   340
20180101  Service   tr_Service_Data   delta_Service     340
20180101  Service   tr_Service_Data   final_Service     340

Expected output:

Run_Date  Source    Differences
20180101  Customer  0
20180101  Product   240
20180101  Service   0

Answer:

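I filtered out the rows that are not needed with a where clause, keeping only the staging_ and delta_ entities, and then used lag() to compare their Total_Count within each Source. Ordered by Entity, delta_ sorts before staging_, so lag() on the staging_ row returns the delta_ count. row_number() over the reverse order gives that staging_ row rn = 1, so the row with rn = 1 is the one that carries the difference.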

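select Run_Date
      ,Source
      ,Differences
from
       (select Run_Date
              ,Source
              -- delta_ sorts before staging_, so lag() on the staging_ row returns the delta_ count;
              -- Run_Date is added to the partition so runs on different dates are not mixed
              ,abs(Total_Count-lag(Total_Count) over(partition by Run_Date, Source order by Entity)) as Differences
              -- reverse order: rn = 1 marks the staging_ row, which carries the difference
              ,row_number() over(partition by Run_Date, Source order by Entity desc)                 as rn
        from t
        where Entity like '%staging%' or
              -- the source text is cut off from this point; the '%delta%' pattern and the
              -- rn = 1 filter below are an assumed completion, not the original wording
              Entity like '%delta%'
       ) t
where rn = 1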
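
As a cross-check, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module and computes the per-source difference with conditional aggregation rather than lag(). It is a different technique from the query above, not a completion of it. The table name t is taken from the answer; the function name count_differences is hypothetical, and the || concatenation operator is SQLite/ANSI syntax that some databases spell differently (for example CONCAT or +).

```python
import sqlite3

# Sample rows copied from the question:
# (Run_Date, Source, Process, Entity, Total_Count)
ROWS = [
    ("20180101", "Customer", "tr_Customer_Data", "hdp_Customer_sql", 1500),
    ("20180101", "Customer", "tr_Customer_Data", "staging_Customer", 755),
    ("20180101", "Customer", "tr_Customer_Data", "delta_Customer", 755),
    ("20180101", "Customer", "tr_Customer_Data", "final_Customer", 755),
    ("20180101", "Product", "tr_Product_Data", "hdp_Product_sql", 570),
    ("20180101", "Product", "tr_Product_Data", "staging_Product", 240),
    ("20180101", "Product", "tr_Product_Data", "delta_Product", 0),
    ("20180101", "Product", "tr_Product_Data", "final_Product", 0),
    ("20180101", "Service", "tr_Service_Data", "hdp_Service_sql", 2300),
    ("20180101", "Service", "tr_Service_Data", "staging_Service", 340),
    ("20180101", "Service", "tr_Service_Data", "delta_Service", 340),
    ("20180101", "Service", "tr_Service_Data", "final_Service", 340),
]

# Assumed alternative to the lag() approach: collapse each (Run_Date, Source)
# group to one row and pick out the staging_ and delta_ counts with CASE.
# The entity names are built from the naming convention stated in the question.
QUERY = """
SELECT Run_Date,
       Source,
       ABS(
           SUM(CASE WHEN Entity = 'staging_' || Source THEN Total_Count ELSE 0 END)
         - SUM(CASE WHEN Entity = 'delta_'   || Source THEN Total_Count ELSE 0 END)
       ) AS Differences
FROM t
GROUP BY Run_Date, Source
ORDER BY Run_Date, Source
"""


def count_differences(rows):
    """Return (Run_Date, Source, Differences) tuples for the given rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE t (Run_Date TEXT, Source TEXT, Process TEXT, "
            "Entity TEXT, Total_Count INTEGER)"
        )
        conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in count_differences(ROWS):
        print(row)
```

On the sample data this yields 0 for Customer, 240 for Product and 0 for Service. Removing ABS would keep the sign of staging minus delta, which may be useful if the direction of the mismatch matters.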