I'm trying to update a table from another one, the request ends up successful but no rows are updated. Each table contains ~32M rows. I'm running on PostgreSQL 11.12.
Here's the 2 tables (i've removed columns not used in the request) :
CREATE TABLE IF NOT EXISTS public.sirene_geo
(
siret character varying(50) COLLATE pg_catalog."default" NOT NULL,
x numeric,
y numeric,
CONSTRAINT sirene_geo_etablissement_pkey PRIMARY KEY (siret)
)
CREATE TABLE IF NOT EXISTS public.sirene_eta
(
siret character varying(50) COLLATE pg_catalog."default" NOT NULL,
latitude numeric,
longitude numeric,
CONSTRAINT sirene_stock_etablissement_pk PRIMARY KEY (siret)
)
Update request :
UPDATE sirene_eta eta
SET longitude = x,
latitude = y
FROM sirene_geo geo
WHERE eta.siret = geo.siret
On PgAdmin (v5.4), it is indicated -1 for "Rows Affected" field.
Postgres is using hash join strategy to complete the update.
Also, there are less rows in sirene_geo than sirene_eta, still Postgres is building hash table on sirene_geo (thus leading to no match for some rows).
When I try an update with limit inside subquery table, it works but it is using nested loop strategy which is definitely not suitable to update the whole table.
Thanks !
CodePudding user response:
You are running out of space on your storage device. Make room on disk (or whatever you use as storage) before starting the big UPDATE
. Delete dispensable files (unrelated to the database). Or shrink your database somehow.
A plain VACUUM
might do the job. Or VACUUM FULL
(blocks concurrent access) to aggressively shrink physical storage. If you cannot afford to block, consider one of the non-blocking community tools. See:
VACUUM FULL
preferably not on sirene_eta
(the target table) which will reuse dead tuples in the UPDATE
anyway (after a plan VACUUM
). And make sure VACUUM
is not blocked by a long running transaction. See:
- What are the consequences of not ending a database transaction?
- Why does this PostgreSQL transaction give "WARNING: there is no transaction in progress"
Whatever else you do, if you don't expect that all targeted rows actually change, add a WHERE
condition to filter empty updates (at full cost!)
UPDATE sirene_eta eta
SET longitude = geo.x
, latitude = geo.y
FROM sirene_geo geo
WHERE eta.siret = geo.siret
AND (eta.longitude IS DISTINCT FROM geo.x -- !
OR eta.latitude IS DISTINCT FROM geo.y)
Might even fix your problem by reducing the work to be done (dramatically). (Turns out, it doesn't in your case.
See: