How to delete rows (with geometry column) from a table in PostgreSQL (PostGIS) where the row has no-CodePudding

I have seen a strange behavior in PostgreSQL (PostGIS). I have two tables in PostGIS with geometry columns. one table is a grid and the other one is lines. I want to delete all grid cells that no line passes through them. In other words, I want to delete the rows from a table when that row has no spatial intersection with any rows of second table.

First, in a subquery, I find the ids of the rows that have any intersection. Then, I delete any row that its id is not in that returned list of ids.

DELETE FROM base_grid_916453354
WHERE id NOT IN 
(
     SELECT DISTINCT bg.id
     FROM base_grid_916453354 bg, (SELECT * FROM tracks_heatmap_1000 
     LIMIT 100000) tr
     WHERE bg.geom && tr.buffer
);

The following subquery returns in only 12 seconds

SELECT DISTINCT bg.id
FROM base_grid_916453354 bg, (SELECT * FROM tracks_heatmap_1000 LIMIT 
100000) tr
WHERE bg.geom && tr.buffer

, while the whole query did not return even in 1 hour!!

I ran explain command and it is the result of it, but I cannot interpret it:

How can I improve this query and why deleting from the returned list takes so much of time?

It is very strange because the subquery is a spatial query between 2 tables of 9 million and 100k rows, while the delete part is just checking a list and deleting!! In my mind, the delete part is much much easier.

CodePudding user response：

Don't post text as images of text!

Increase work_mem until the subplan becomes a hashed subplan.

Or rewrite it to use 'NOT EXISTS' rather than NOT IN

CodePudding user response：

I found a fast way to do this query. As @jjanes said, I used EXISTS() function:

DELETE FROM base_grid_916453354 bg
WHERE NOT EXISTS 
(
   SELECT 1
   FROM  tracks_heatmap_1000  tr
   WHERE bg.geom && tr.buffer
);

This query takes around 1 minute and it is acceptable for the size of my tables.