Home > Blockchain >  Postgresql request not updating rows but returns successful
Postgresql request not updating rows but returns successful

Time:03-21

I'm trying to update a table from another one, the request ends up successful but no rows are updated. Each table contains ~32M rows. I'm running on PostgreSQL 11.12.

Here's the 2 tables (i've removed columns not used in the request) :

CREATE TABLE IF NOT EXISTS public.sirene_geo
(
    siret character varying(50) COLLATE pg_catalog."default" NOT NULL,
    x numeric,
    y numeric,
    CONSTRAINT sirene_geo_etablissement_pkey PRIMARY KEY (siret)
)

CREATE TABLE IF NOT EXISTS public.sirene_eta
(
    siret character varying(50) COLLATE pg_catalog."default" NOT NULL,  
    latitude numeric,
    longitude numeric,
    CONSTRAINT sirene_stock_etablissement_pk PRIMARY KEY (siret)
)

Update request :

UPDATE sirene_eta eta
SET longitude = x,
    latitude = y
FROM sirene_geo geo
WHERE eta.siret = geo.siret

On PgAdmin (v5.4), it is indicated -1 for "Rows Affected" field.
Postgres is using hash join strategy to complete the update. Also, there are less rows in sirene_geo than sirene_eta, still Postgres is building hash table on sirene_geo (thus leading to no match for some rows).

When I try an update with limit inside subquery table, it works but it is using nested loop strategy which is definitely not suitable to update the whole table.

Thanks !

CodePudding user response:

You are running out of space on your storage device. Make room on disk (or whatever you use as storage) before starting the big UPDATE. Delete dispensable files (unrelated to the database). Or shrink your database somehow.

A plain VACUUM might do the job. Or VACUUM FULL (blocks concurrent access) to aggressively shrink physical storage. If you cannot afford to block, consider one of the non-blocking community tools. See:

VACUUM FULL preferably not on sirene_eta (the target table) which will reuse dead tuples in the UPDATE anyway (after a plan VACUUM). And make sure VACUUM is not blocked by a long running transaction. See:

Whatever else you do, if you don't expect that all targeted rows actually change, add a WHERE condition to filter empty updates (at full cost!)

UPDATE sirene_eta eta
SET    longitude = geo.x
     , latitude  = geo.y
FROM   sirene_geo geo
WHERE  eta.siret = geo.siret
AND   (eta.longitude IS DISTINCT FROM geo.x  -- !
    OR eta.latitude  IS DISTINCT FROM geo.y)

Might even fix your problem by reducing the work to be done (dramatically). (Turns out, it doesn't in your case.

See:

  • Related