Backup Postgres Table using pg_dump is there a workaround to include the --jobs option


I am using PostgreSQL 12 and have a database that's 1 TB in size. I am performing a big delete on a table that is 140 GB in size.

I am testing the process and am looking to do a pg_dump of the table and its contents; however, running it as is takes approximately 33 minutes:

pg_dump -d titan -t public.play > /backup/playBackup.sql

I know that pg_dump includes a --jobs option which, given I have a 32-core machine, I could really utilise, but that applies to backing up a whole database rather than a single table within it.
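(For reference, --jobs requires pg_dump's directory output format, so a whole-database parallel dump would look roughly like this; /backup/titan_dir is just an illustrative target directory:)

pg_dump -F directory --jobs=8 -f /backup/titan_dir titan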

Is there a quicker way of backing up the table I need?

CodePudding user response:

The --jobs option, which works only with a directory-format dump, won't help you with a single table, because a single table is dumped by a single process.

You could of course start several parallel COPY statements, splitting the rows on an integer key such as id:

COPY (SELECT * FROM public.play WHERE id % 5 = 0) TO '/path/play0.csv' (FORMAT 'csv');
COPY (SELECT * FROM public.play WHERE id % 5 = 1) TO '/path/play1.csv' (FORMAT 'csv');
COPY (SELECT * FROM public.play WHERE id % 5 = 2) TO '/path/play2.csv' (FORMAT 'csv');
COPY (SELECT * FROM public.play WHERE id % 5 = 3) TO '/path/play3.csv' (FORMAT 'csv');
COPY (SELECT * FROM public.play WHERE id % 5 = 4) TO '/path/play4.csv' (FORMAT 'csv');

If you start these statements at the same time, you have a chance of getting synchronized sequential scans, so the table only has to be read once. Then you can load those files in parallel.
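One rough way to launch those five statements at the same time is a small shell loop. This is only a sketch: it assumes the table has an integer id column and that the PostgreSQL server user can write to /path, since COPY ... TO a file runs on the server, not the client:

for i in 0 1 2 3 4; do
  # each background psql runs one slice of the table
  psql -d titan -c "COPY (SELECT * FROM public.play WHERE id % 5 = $i) TO '/path/play$i.csv' (FORMAT 'csv')" &
done
wait   # block until all five server-side COPY commands have finished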

If you need the table structure too, run these:

pg_dump --section=pre-data -t public.play titan
pg_dump --section=post-data -t public.play titan

First restore pre-data, then the data, then post-data.
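Putting it together, a possible end-to-end sequence (file names here are only illustrative) is to save the two structure dumps, restore the table definition, reload the CSV slices in parallel, and only then rebuild indexes and constraints:

pg_dump --section=pre-data -t public.play titan > /backup/play_predata.sql
pg_dump --section=post-data -t public.play titan > /backup/play_postdata.sql

psql -d titan -f /backup/play_predata.sql      # recreate the bare table
for i in 0 1 2 3 4; do                         # reload the slices in parallel
  psql -d titan -c "\copy public.play FROM '/path/play$i.csv' (FORMAT 'csv')" &
done
wait
psql -d titan -f /backup/play_postdata.sql     # add indexes and constraints last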
