I upgraded from Postgres 10 to Postgres 14 using pg_upgrade --link
option. The total databases size is around 10TB. The pg_upgrade was successful and quick and like the tool suggested -
Optimizer statistics are not transferred by pg_upgrade. Once you start the new server, consider running: /usr/pgsql-14/bin/vacuumdb --all --analyze-in-stages
I ran the above command but the process is stuck. As a side effect of this (or not, not sure) when I create a publication the prompt never comes back and the publication is not created even after many hours.
postgres=# select * from pg_stat_progress_vacuum;
c1 | c2 |
---|---|
pid | 9520 |
datid | 16402 |
datname | xyz |
relid | 22423 |
phase | vacuuming indexes |
heap_blks_total | 232816470 |
heap_blks_scanned | 36766348 |
heap_blks_vacuumed | 0 |
index_vacuum_count | 0 |
max_dead_tuples | 11184809 |
num_dead_tuples | 11184521 |
This is the same output from yesterday. What can I do to speed up this and the "create publication" command? On a side note: The VM on which Postgres is running is quite powerful (64GB RAM, 16 cores). Thanks!
edit 1: the output of pg_stat_activity for the same pid,
c1 | c2 |
---|---|
pid | 9520 |
backend_start | 2021-12-06 15:13:23.479071-08 |
xact_start | 2021-12-06 15:13:23.512581-08 |
query_start | 2021-12-06 15:13:23.512581-08 |
state_change | 2021-12-06 15:13:23.512581-08 |
wait_event_type | Timeout |
wait_event | VacuumDelay |
state | active |
backend_xmin | 3140627534 |
query | autovacuum: VACUUM xyz (to prevent wraparound) |
backend_type | autovacuum worker |
CodePudding user response:
vacuumdb --all --analyze-in-stages
will not run VACUUM
, but ANALYZE
, so you have to look into pg_stat_progress_analyze
to see how it is doing.
The VACUUM
process you see running is unrelated to that. It is an anti-wraparound vacuum that is currently sleeping, but otherwise processing. Let it finish; this process is important for the health of your database. If you want further autovacuum runs on that table to complete faster, reduce autovacuum_vacuum_cost_delay
for that table.
CodePudding user response:
Simply going the upgrade should not have caused the anti-wraparound vacuum to run, so that it decided to run right after the upgrade may be a coincidence of timing. On the other hand, maybe your new database version has different config settings, for example a lower value for autovacuum_freeze_max_age
, running with this new setting is what triggered it to kick in right now. Did you transfer all of your non-default config settings from v10 to v14?
I agree with Laurenz that do need to let this finish, but that doesn't mean you need to let it finish right now. You could terminate the vacuuming backend so that your CREATE PUBLICATION has a chance to run. The autovac might promptly restart, so you should have the CREATE PUBLICATION already trying to run when you cancel the vacuum. That way it will be able to acquire the lock before the vacuum can start up again and grab the lock again. But do make sure you don't get in the habit of cancelling vacuums every time they inconvenience you.
Also, you should increase maintenance_work_mem by a lot. It looks like it is currently set at 64MB, which is quite low for the server you describe. If you set it to 1GB, then it should be able to vacuum the entire table with only one pass over the indexes rather than the seven you are currently on pace for. I would change this setting in the conf file and SIGHUP the server before you cancel the vacuum, that way the new vacuum that gets launched should have the new setting.
Finally, I don't know why this is vacuuming the index. I thought v14 changed it so that emergency vacuums didn't bother to do that, but just froze tuples in the heap and left indexes for later. I guess I need to study v14 a bit more to figure out what it really does here.