Huge delete on PostgreSQL table: deleting 99.9% of the rows of the table


I have a table in my PostgreSQL database that has become huge, filled with a lot of useless rows. As these useless rows represent 99.9% of my table's data (about 3.3M rows), I was wondering if deleting them could have a bad impact on my DB:

I know that this operation could take some time, and I will be able to block writes on the table during the maintenance operation. But I was wondering if this huge change in the data could also impact performance after the operation itself.

I found solutions like creating a new table or using TRUNCATE to drop all rows, but as this operation is a specific, one-shot task, I would like to be able to choose the most suitable solution.

I know that PostgreSQL has a VACUUM mechanism, but I'm not a DBA expert: could anyone please confirm that this delete will not impact my table's integrity / data structure, and that the freed space will be reclaimed if needed for new data?

PostgreSQL 11.12, with default settings, on AWS RDS. I don't have any index on my table, and the criteria for row deletion will not be based on the PK.

CodePudding user response:

Deleting rows typically does not shrink a PostgreSQL table, so you would then have to run VACUUM (FULL) to compact it, during which the table is inaccessible.
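To make the distinction concrete, here is what the two variants look like (my_table is a made-up name):

    -- Plain VACUUM marks the dead row versions as free space that
    -- future inserts can reuse, but it normally does not return
    -- disk space to the operating system.
    VACUUM my_table;

    -- VACUUM (FULL) rewrites the table into a new, compact file and
    -- does give the space back, but it holds an ACCESS EXCLUSIVE
    -- lock, blocking all reads and writes while it runs.
    VACUUM (FULL) my_table;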

If you are deleting many rows, both the DELETE and the VACUUM (FULL) will take a long time, and you would be much better off proceeding like this (a SQL sketch of the whole procedure follows the list):

  • create a new table that is defined like the old one

  • INSERT INTO new_tab SELECT * FROM old_tab WHERE ... to copy over the rows you want to keep

  • drop foreign key constraints that point to the old table

  • create all indexes and constraints on the new table

  • drop the old table and rename the new one

By planning that carefully, you can get away with a short down time.
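A minimal sketch of that procedure, wrapped in a single transaction so the swap is atomic. All names here (old_tab, new_tab, keep_me, child and its constraint) are made up for the example; adapt them to your schema:

    BEGIN;

    -- New table with the same columns, defaults and NOT NULL
    -- constraints; indexes are deliberately not copied yet, so the
    -- bulk insert below runs faster.
    CREATE TABLE new_tab (LIKE old_tab INCLUDING DEFAULTS);

    -- Copy over only the ~0.1% of rows worth keeping
    -- (keep_me stands in for whatever your real condition is).
    INSERT INTO new_tab SELECT * FROM old_tab WHERE keep_me;

    -- Drop foreign keys that point to the old table...
    ALTER TABLE child DROP CONSTRAINT child_old_tab_fk;

    -- ...then recreate indexes and constraints on the new table.
    ALTER TABLE new_tab ADD PRIMARY KEY (id);

    -- Swap the tables and re-point the foreign key.
    DROP TABLE old_tab;
    ALTER TABLE new_tab RENAME TO old_tab;
    ALTER TABLE child ADD CONSTRAINT child_old_tab_fk
        FOREIGN KEY (old_tab_id) REFERENCES old_tab (id);

    COMMIT;

Note that writes to old_tab must already be blocked before the copy starts (as you said you can do), otherwise rows written during the INSERT ... SELECT would silently be lost in the swap. Since DDL is transactional in PostgreSQL, everything between BEGIN and COMMIT either happens as one unit or not at all.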
