Given a source
table that changes overtime, here I will define source_before
and source_after
to show the differences happening in the table, but in reality it's the same table
create table source_before (
binary_path varchar2(40),
hostname varchar2(40),
change_column varchar2(40),
flag varchar2(20) default 'open'
);
insert all
into source_before (binary_path,hostname,change_column) values ('python','a','drn')
into source_before (binary_path,hostname,change_column) values ('java','b','drn')
into source_before (binary_path,hostname,change_column) values ('apache','c','drn')
into source_before (binary_path,hostname,change_column) values ('spark','d','drn')
select * from dual;
create table source_after (
binary_path varchar2(40),
hostname varchar2(40),
change_column varchar2(40),
flag varchar2(20) default 'open'
);
insert all
into source_after (binary_path,hostname,change_column) values ('java','b','DMZ')
into source_after (binary_path,hostname,change_column) values ('apache','c','drn')
into source_after (binary_path,hostname,change_column) values ('NEW','NEW','NEW')
select * from dual;
--- source_before
python a open drn
java b open drn
apache c open drn
spark d open drn
--- source_after
java b open DMZ
apache c open drn
NEW NEW open NEW
The primary key is the combination (binary_path,hostname). Now given a destination table where I want to fetch changes that are happening in the source table.
create table destination (
binary_path varchar2(40),
hostname varchar2(40),
change_column varchar2(40),
flag varchar2(20),
creation_time timestamp,
open_close_timestamp timestamp
);
Initially the table is empty, and the flag
column should follow this logic:
- When the primary key combination is detected the first time, the
flag
shouldopen
which is the default from thesource_*
tables. - Whenever the primary key combination disappear from the
source_*
tables theflag
shouldclose
. - Whenever the primary key combination appears once again from the
source_*
tables theflag
which wasclose
shouldre-open
.
I managed to simulate this behavior with two statements:
merge into destination d
using (select * from source_before) s on (d.hostname = s.hostname and d.binary_path = s.binary_path)
when matched then update
set
d.change_column = s.change_column,
d.flag =
(
case
when d.flag = 'close' then 're-open'
else s.flag
end
)
when not matched then insert
(d.binary_path,d.hostname,d.change_column,d.flag,d.creation_time,d.open_close_timestamp)
values
(s.binary_path,s.hostname,s.change_column,s.flag,current_timestamp,current_timestamp)
;
UPDATE destination d
SET
flag = 'close',
open_close_timestamp = current_timestamp
WHERE NOT EXISTS (SELECT *
FROM source_after s
WHERE s.hostname = d.hostname and
d.binary_path = s.binary_path) and flag <> 'close'
;
I also managed to update the column open_close_timestamp
to get the latest timestamp when flag
updates to the close
state. However I did not manage to update open_close_timestamp
for the case when the flag
changes to open
or re-open
. The goal is not only to open
, re-open
and close
based on the appearance, re-appearance and disappearance of the combination of primary keys but also to timestamp whenever that modification happens. Note that in order to re-appear the flag can only be close
in its previous state.
CodePudding user response:
Try this one:
merge into destination d
using (
select s.binary_path, s.hostname, s.change_column, s.flag from source s
union all
select d.binary_path, d.hostname, d.change_column, 'close' from destination d
where not exists(select 1 from source s where s.binary_path = d.binary_path and s.hostname = d.hostname)
) s
on (d.hostname = s.hostname and d.binary_path = s.binary_path)
when matched then update
set
d.change_column = s.change_column,
d.flag = case when d.flag = 'close' then 're-open' else s.flag end,
d.open_close_timestamp = case when s.flag = 'close' or d.flag <> 'open' then current_timestamp else d.open_close_timestamp end
when not matched then insert
(d.binary_path,d.hostname,d.change_column,d.flag,d.creation_time,d.open_close_timestamp)
values
(s.binary_path,s.hostname,s.change_column,s.flag,current_timestamp,current_timestamp)
;
CodePudding user response:
You can use:
MERGE INTO destination d
USING source s
ON (d.hostname = s.hostname AND d.binary_path = s.binary_path)
WHEN MATCHED THEN
UPDATE
SET change_column = COALESCE(s.change_column, d.change_column),
flag = CASE
WHEN d.flag IN ('open', 'reopen')
THEN 'closed'
ELSE 'reopen'
END,
open_close_timestamp = SYSTIMESTAMP
WHEN NOT MATCHED THEN
INSERT (binary_path,hostname,change_column,flag,creation_time,open_close_timestamp)
VALUES (s.binary_path,s.hostname,s.change_column,s.flag,SYSTIMESTAMP,SYSTIMESTAMP);
If the source
table contains the source_before
rows then after the MERGE
the table contains:
BINARY_PATH | HOSTNAME | CHANGE_COLUMN | FLAG | CREATION_TIME | OPEN_CLOSE_TIMESTAMP |
---|---|---|---|---|---|
python | a | drn | open | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.123768 |
java | b | drn | open | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.123768 |
spark | d | drn | open | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.123768 |
apache | c | drn | open | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.123768 |
If the source
table is then changed to contains the source_after
rows then the MERGE
is run again the table will contain:
BINARY_PATH | HOSTNAME | CHANGE_COLUMN | FLAG | CREATION_TIME | OPEN_CLOSE_TIMESTAMP |
---|---|---|---|---|---|
python | a | drn | open | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.123768 |
java | b | DMZ | closed | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.190947 |
spark | d | drn | open | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.123768 |
apache | c | drn | closed | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.190947 |
NEW | NEW | NEW | open | 06-JAN-23 19.47.40.190947 | 06-JAN-23 19.47.40.190947 |
If the source
table still contains the source_after
rows then the MERGE
is run again the table will change to:
BINARY_PATH | HOSTNAME | CHANGE_COLUMN | FLAG | CREATION_TIME | OPEN_CLOSE_TIMESTAMP |
---|---|---|---|---|---|
python | a | drn | open | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.123768 |
java | b | DMZ | reopen | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.193350 |
spark | d | drn | open | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.123768 |
apache | c | drn | reopen | 06-JAN-23 19.47.40.123768 | 06-JAN-23 19.47.40.193350 |
NEW | NEW | NEW | closed | 06-JAN-23 19.47.40.190947 | 06-JAN-23 19.47.40.193350 |