My table structure is as described in this post:
name | version | processed | processing | updated | ref_time
------ --------- ----------- ------------ ---------- ----------
abc | 1 | t | f | 27794395 | 27794160
def | 1 | t | f | 27794395 | 27793440
ghi | 1 | t | f | 27794395 | 27793440
jkl | 1 | f | f | 27794395 | 27794160
mno | 1 | t | f | 27794395 | 27793440
pqr | 1 | f | t | 27794395 | 27794160
Based on this answer, I am deriving a list of ref_time
values which I want to use as a basis for deleting 'old' entries from status_table
.
This is the query to generate the list of relevant ref_time
values:
WITH main AS
(
SELECT ref_time,
ROUND(AVG(processed::int) * 100, 1) percent
FROM status_table
GROUP BY ref_time ORDER BY ref_time DESC, percent DESC
)
SELECT ref_time FROM main WHERE percent=100 OFFSET 2;
For example this might return:
ref_time
----------
27794880
27794160
I can then use this to DELETE
all relevant entries in the status_table
:
DELETE FROM status_table
WHERE ref_time IN
(
WITH main AS
(
SELECT ref_time,
ROUND(AVG(processed::int) * 100, 1) percent
FROM status_table
GROUP BY ref_time ORDER BY ref_time DESC, percent DESC
)
SELECT ref_time FROM main WHERE percent=100 OFFSET 2
);
But I have another table named data_table
, which also has a ref_time
column, and I want to DELETE
entries from that table on the same basis, i.e. any rows having ref_time
in the above list.
How do I achieve this without duplicating the query used to generate the ref_time
list?
CodePudding user response:
You can use common table expressions:
with
ref as (
select ref_time
from status_table
group by ref_time
having bool_and(processed)
order by ref_time desc limit 2
),
del_ref as (
delete from status_table s
using ref r
where s.ref_time = r.ref_time
)
delete from data_table d
using ref r
where d.ref_time = r.ref_time
The first CTE,ref
, returns the list of timestamps that you want to delete from the two other tables. I attempted to simplify the logic: you seem to want the top 2 timestamps that are fully processed (note that offset
skips that many rows from the resultset, which is different than limit
).
The second CTE deletes from status_table
, and the last part of the query addresses data_table
.