Home > Software design >  Executiung performant SQL queries equivalent to "nested deletes"
Executiung performant SQL queries equivalent to "nested deletes"

Time:06-19

Consider the following ERD for Order, Item and Article entities:

ERD for Order, Item and Article entities

I'd like to delete Orders satisfying a complex condition, and then delete Items associated to those Orders, and finally delete Articles associated with those Items.

Cascade deleting from Orders to Items is possible as Order is a parent of Item. But cascade deleting from Item to Article isn't possible as Item is not a parent of Article. Setting up a trigger is also not an option, since in general I don't want to delete the Article whenever an Item is removed - it should only happen in the context of this query.

I'm working with PostgreSQL, which supports DELETE ... RETURNING statements. Unfortunatly, the result cannot be nested like this:

DELETE FROM articles WHERE id IN
  (DELETE FROM items WHERE order_id IN
    (DELETE FROM orders WHERE complex_condition RETURNING id)
  RETURNING article_id)

What is the most efficient way to execute all three delete statements? Each table contains tens of millions of records, and the complex_condition is the most time-consuming part, so I prefer not to execute it more than once.

One idea I have is to create a temporary table like this:

CREATE TEMP TABLE id_of_order_to_be_deleted
AS
WITH cte1 AS (SELECT id FROM orders WHERE complex_condition)
SELECT *
FROM cte1;

And then use it to delete Orders and Items. This way, complex_condition is only evaluated once. But I think this is an overkill, and there should be a simpler solution.

CodePudding user response:

The result can be nested using a data modifying CTE:

with deleted_orders as (
  DELETE FROM orders 
  WHERE complex_condition 
  RETURNING id
), deleted_items as (
  DELETE FROM items 
  WHERE order_id IN (select id from deleted_orders)
  returning article_id
)
DELETE FROM articles 
WHERE id IN (select article_id from deleted_items);

You might want to try where exists condition as an alternative - sometimes those are faster than IN conditions.

  • Related