Home > other >  Oracle SQL delete, insert records, keeping fixed quantity of rows ordered by date
Oracle SQL delete, insert records, keeping fixed quantity of rows ordered by date

Time:10-23

As the title says, I'm trying to delete & insert records to an Historical Table.

The Historical tbl_b has to receive records from a tbl_a, filtered by the column val_id; but there is a limit of records per each val_id, also it must keep only the most recent ones per val_id.

TBL_A
  • It has columns id_val, reg_date, flag
  • it has up to date records (constantly, records are inserted on this table).
  • only records with flag=1 should be inserted on TBL_B.
  • records are deleted by another scheduled process.
val_id reg_date flag
33 2022-10-20 23:00:00 1
22 2022-10-20 22:00:01 0
22 2022-10-20 22:00:02 1
11 2022-10-20 21:00:01 1
11 2022-10-20 21:00:02 1
11 2022-10-20 21:00:03 1
TBL_B:
  • It has columns id_val, reg_date
  • it should store 2 records per id_val and the most recent ones (order by reg_date).
  • it's partitioned monthly, this table will store 150 Million records aprox.
val_id reg_date
11 2022-10-19 11:00:01
11 2022-10-19 11:00:02
22 2022-10-19 12:00:01
22 2022-10-19 12:00:02
Desired Result on TBL_B:
val_id reg_date
11 2022-10-20 21:00:02
11 2022-10-20 21:00:03
22 2022-10-19 12:00:02
22 2022-10-20 22:00:02
33 2022-10-20 23:00:00

To approach this, I'm trying to do it in 2 steps:

  1. first delete records from TBL_B, if it's necessary.
  2. then insert records from TBL_A, since they are always the most recent ones.

But at this moment I'm stuck trying to filter the records that should be deleted from TBL_B.

Link: http://sqlfiddle.com/#!4/73271c/1

DELETE FROM tbl_b where rowid in (
SELECT
  rowid
FROM (SELECT
  m.*,
  ROW_NUMBER() OVER (
  PARTITION BY id_val
  ORDER BY
  reg_date DESC
  ) AS rownumb
FROM (SELECT
  h.*
FROM tbl_b h
LEFT JOIN (SELECT
  *
FROM (SELECT
  tbl_a.*,
  ROW_NUMBER() OVER (
  PARTITION BY id_val
  ORDER BY
  reg_date DESC
  ) AS seqnum
FROM tbl_a
WHERE flag = 1) f
WHERE f.seqnum <= 2) t
  ON t.id_val = h.id_val) m) n
WHERE n.rownumb > 2
);

Any help is appreciated.

CodePudding user response:

It looks like you have to use analytic functions DENSE_RANK() and ROW_NUMBER() in your SQL query in order to resolve the task:

DELETE FROM tbl_b 
 WHERE reg_date in (SELECT reg_date 
                      FROM (SELECT t.*
                                 , dense_rank() over (order by trunc(reg_date) desc) as d_rank
                                 , row_number() over (PARTITION BY TRUNC(REG_DATE) order by reg_date DESC) as r_num
                             
                              FROM tbl_b t)
                     WHERE d_rank = 2 and r_num <> 1);

dbfiddle

Note: DELETE has to run after insert into the records from TBL_A (the most recent ones)

It would be good idea - CREATE TRIGGER AFTER INSERT ON tbl_b to perform DELETE

CodePudding user response:

Actually, I'd rather do it just the opposite of what you wanted: I'd insert rows first, and delete superfluous rows next. Why? Because - if you're deleting rows first, you have to calculate how many rows to leave in TBL_B so that you'd insert the right number of rows from TBL_A so that TBL_B always contains up to 2 most recent rows. That's just too complex.

Also, indexes you currently have on these tables are kind of wrong; I'd index id_val and reg_date in both tables; indexing flag (at the first sight, unless there are many flags, not just e.g. 0 and 1) won't help much.

OK, here's my suggestion: initial TBL_B table contents:

SQL> select * from tbl_b order by id_val, reg_date;

ID_VAL REG_DATE
------ -------------------
11     2022-10-19 11:00:01
11     2022-10-19 11:00:02
22     2022-10-19 12:00:01
22     2022-10-19 12:00:02

SQL>

Insert first: fetch two most recent rows per id_val - that's the maximum rows you'd have if there are no rows in TBL_B at all:

SQL>     insert into tbl_b (id_val, reg_date)
  2        with temp as
  3          (select id_val,
  4                  reg_date,
  5                  rank() over (partition by id_val order by reg_date desc) rnk
  6           from tbl_a
  7           where flag = 1
  8          )
  9        select t.id_val, t.reg_date
 10        from temp t
 11        where t.rnk <= 2;

4 rows created.

Delete superfluous rows:

SQL> delete from tbl_b a
  2  where exists
  3         (select null
  4          from (select b.id_val,
  5                       b.reg_date,
  6                       rank() over (partition by b.id_val order by b.reg_date desc) rnk
  7                from tbl_b b
  8               ) x
  9          where x.id_val = a.id_val
 10            and x.reg_date = a.reg_date
 11            and x.rnk > 2
 12         );

3 rows deleted.

The final result:

SQL> select * from tbl_b order by id_val, reg_date;

ID_VAL REG_DATE
------ -------------------
11     2022-10-20 11:00:02
11     2022-10-20 11:00:03
22     2022-10-19 12:00:02
22     2022-10-20 22:00:02
33     2022-10-20 23:00:00

SQL>
  • Related