Home > front end >  Oracle table A minus table B full analysis
Oracle table A minus table B full analysis

Time:08-03

I am loading Table_A daily containing a unique user_id and columns with various user info . I need to keep track of the daily changes for every load. This means i need new rows, deleted rows and any modified row. I also need to distinguish which row has been deleted/added/modified.

Before truncating Table_A, i copy all data to Table_A_prev_day. Then i produce the folowing:

insert into table_diff
/*minus to provide new rows and updates*/
select * from Table_A
minus 
select * from Table_A_prev_day
union
/*minus to provide deletions and modifications*/
select * from Table_A_prev_day
minus
select * from Table_A

Then to distinguish the differences:

/*get all modified rows with current and previous values and flag them as UPDATED*/
select tar.*,prv.*,'UPDATED' as diff_status 
from Table_A tar
inner join Table_A_prev_day prv
on tar.user_id = prv.user_id 
inner join on table_diff df
on tar.user_id = df.user_id
UNION
/*get deleted rows*/
select tar.*,prv.*,'DELETED' as diff_status 
from Table_A_prev_day prv
left join Table_A tar
on prv.user_id = tar.user_id 
where tar.user_id is null
UNION
/*get inserted rows*/
select tar.*,prv.*,'INSERTED' as diff_status 
from Table_A_prev_day prv
right join Table_A tar
on prv.user_id = tar.user_id 
where prv.user_id is null

So i'm not making much use of the Diff table, only for the modifications. Are there any other more efficient ways to go?

CodePudding user response:

I don't know your exact context, but maybe you should consider creating a kind of Logger package and write some database triggers to handle this. On the other hand this code could help you - hopefully.

WITH
    tbl_prev_day AS
        (
            Select 1 "ID", 'Data 1' "ID_NAME", 12 "ID_QUANTITY" From Dual Union All
            Select 2 "ID", 'Data 2' "ID_NAME", 10 "ID_QUANTITY" From Dual Union All
            Select 3 "ID", 'Data 3' "ID_NAME", 14 "ID_QUANTITY" From Dual Union All
            Select 4 "ID", 'Data 4' "ID_NAME", 12 "ID_QUANTITY" From Dual Union All
            Select 5 "ID", 'Data 5' "ID_NAME", 16 "ID_QUANTITY" From Dual
        ),
    tbl AS
        (
            Select 1 "ID", 'Data 1' "ID_NAME", 12 "ID_QUANTITY" From Dual Union All
--          Select 2 "ID", 'Data 2' "ID_NAME", 10 "ID_QUANTITY" From Dual Union All     -- DELETED ROW
            Select 3 "ID", 'Data 3' "ID_NAME", 14 "ID_QUANTITY" From Dual Union All
            Select 4 "ID", 'Data x' "ID_NAME", 12 "ID_QUANTITY" From Dual Union All     -- UPDATED ROW
            Select 5 "ID", 'Data 5' "ID_NAME", 16 "ID_QUANTITY" From Dual Union All     -- INSERTED ROW
            Select 6 "ID", 'Data 6' "ID_NAME", 11 "ID_QUANTITY" From Dual Union All     -- INSERTED ROW
            Select 7 "ID", 'Data 7' "ID_NAME", 13 "ID_QUANTITY" From Dual Union All     -- INSERTED ROW
            Select 8 "ID", 'Data 8' "ID_NAME", 11 "ID_QUANTITY" From Dual Union All     -- INSERTED ROW
            Select 9 "ID", 'Data 9' "ID_NAME", 10 "ID_QUANTITY" From Dual               -- INSERTED ROW
        )

Select
    Nvl(t.ID, t_prev.ID)  "ORIGINAL_ID",
    t.ID "ID",
    t.ID_NAME "ID_NAME",
    t.ID_QUANTITY "ID_QUANTITY",
    t_prev.ID "PREV_ID",
    t_prev.ID_NAME "PREV_ID_NAME",
    t_prev.ID_QUANTITY "PREV_ID_QUANTITY",
    CASE 
        WHEN t_prev.ID Is Null And t_prev.ID_NAME Is Null And t_prev.ID_QUANTITY Is Null THEN 'INSERTED'
        WHEN Nvl(t.ID_NAME, t_prev.ID_NAME) || To_Char(Nvl(t.ID_QUANTITY, t_prev.ID_QUANTITY)) <> Nvl(t_prev.ID_NAME, t.ID_NAME) || To_Char(Nvl(t_prev.ID_QUANTITY, t.ID_QUANTITY)) THEN 'UPDATED'
        WHEN t.ID Is Null And t.ID_NAME Is Null And t.ID_QUANTITY Is Null THEN 'DELETED'
    ELSE 'INTACT'
    END "STATUS",
    CASE 
        WHEN t_prev.ID Is Null And t_prev.ID_NAME Is Null And t_prev.ID_QUANTITY Is Null THEN 'This is new row in the table "tbl"'
        WHEN Nvl(t.ID_NAME, t_prev.ID_NAME) || To_Char(Nvl(t.ID_QUANTITY, t_prev.ID_QUANTITY)) <> Nvl(t_prev.ID_NAME, t.ID_NAME) || To_Char(Nvl(t_prev.ID_QUANTITY, t.ID_QUANTITY)) THEN 'Some or all columns in this row have changed'
        WHEN t.ID Is Null And t.ID_NAME Is Null And t.ID_QUANTITY Is Null THEN 'This row was deleted'
    ELSE 'no comment'
    END "COMMENT"
From
    tbl t
FULL OUTER JOIN
    tbl_prev_day t_prev ON (t_prev.ID = t.ID)
ORDER BY
    Nvl(t.ID, t_prev.ID)
--  
--  R e s u l t
--  
--  ORIGINAL_ID         ID ID_NAME ID_QUANTITY    PREV_ID PREV_ID_NAME PREV_ID_QUANTITY STATUS   COMMENT                                    
--  ----------- ---------- ------- ----------- ---------- ------------ ---------------- -------- --------------------------------------------
--            1          1 Data 1           12          1 Data 1                     12 INTACT   no comment                                   
--            2                                         2 Data 2                     10 DELETED  This row was deleted                         
--            3          3 Data 3           14          3 Data 3                     14 INTACT   no comment                                   
--            4          4 Data x           12          4 Data 4                     12 UPDATED  Some or all columns in this row have changed 
--            5          5 Data 5           16          5 Data 5                     16 INTACT   no comment                                   
--            6          6 Data 6           11                                          INSERTED This is new row in the table "tbl"           
--            7          7 Data 7           13                                          INSERTED This is new row in the table "tbl"           
--            8          8 Data 8           11                                          INSERTED This is new row in the table "tbl"           
--            9          9 Data 9           10                                          INSERTED This is new row in the table "tbl"          

The WITH clause is here just to generate some sample data and, as such, it is not the part of the answer.
The sample data containes some inserts, updates and deletions as well as unchanged data. Tables are joined using full outer join and then the statuses are handled using CASE. The only tricky part could be status "UPDATED" because I don't know how many fields and of what types should be tested for possible changes. Here i use concatination of charachter values (using To_Char if needed) but this could be done using multiple OR conditions too. As already said, don't know the context but you will see if it suits you or not. Regards...

CodePudding user response:

Why do you do this? If your goal is to create a log table then you can create a log table called "Table_A_log". You can insert(record) every transaction to this table before manipulating your table Table_A with an extra column like process type that only contains one of the values from "INSERT, UPDATE, DELETE".

Another possible solution is GoldenGate. If you have Oracle GoldenGate licence this is a one of use case of it. Expensive solution.

Simple and effective solution is triggers. With triggers database itself can record each manipulation on the target table to another table.

  • Related