Home > database >  UPSERT based on UNIQUE constraint with NULL values
UPSERT based on UNIQUE constraint with NULL values

Time:10-28

I have a Postgres table with a unique constraint on multiple columns, one of which can be NULL. I only every want to allow one record with NULL in that column for each combination.

create table my_table (
   col1 int generated by default as identity primary key,
   col2 int not null,
   col3 real,
   col4 int,
   constraint ux_my_table_unique unique (col2, col3)
);

I have an upsert query that I want to update col4 when it encounters a record with the same values in col2, col3:

insert into my_table (col2, col3, col4) values (p_col2, p_col3, p_col4)
on conflict (col2, col3) do update set col4=excluded.col4;

But the conflict is not firing when col3 is NULL. I have read about using triggers. What is the best solution to get the conflict to fire please?

CodePudding user response:

If you can find a value that can never legally exist in col3 (make sure with a check constraint), you could use a unique index:

CREATE UNIQUE INDEX ON my_table (
   col2,
   coalesce(col3, -1.0)
);

and use that in your INSERT:

INSERT INTO my_table (col2, col3, col4)
VALUES (p_col2, p_col3, p_col4)
ON CONFLICT (col2, coalesce(col3, -1.0))
DO UPDATE SET col4 = excluded.col4;

CodePudding user response:

NULL values are not considered equal to each other and thus never trigger a UNIQUE violation. That means, your current table definition does not do what you say it should do. There can already be multiple rows with (col2, col3) = (1, NULL). ON CONFLICT never fires for col3 IS NULL in your current setup.

A UNIQUE NULLS [NOT] DISTINCT option for UNIQUE constraints is in development. But that will be in future Postgres 15 at best.

You can enforce your UNIQUE constraint with two partial UNIQUE indexes as instructed here:

Applied to your case:

CREATE UNIQUE INDEX my_table_col2_uni_idx ON my_table (col2)
WHERE col3 IS NULL;

CREATE UNIQUE INDEX my_table_col2_col3_uni_idx ON my_table (col2, col3)
WHERE col3 IS NOT NULL;

But ON CONFLICT ... DO UPDATE can only be based on a single UNIQUE index or constraint. Only the ON CONFLICT DO NOTHING variant works as "catch-all". See:

It would seem like what you want is currently impossible ...

Perfect solution

There is a perfect solution, though. With the two partial UNIQUE indexes in place, you can use the right statement based on the input value of col3:

WITH input(col2, col3, col4) AS (
   VALUES
     (3, NULL::real, 5)  -- ①
   , (3, 4, 5)
   )
, upsert1 AS (
   INSERT INTO my_table AS t(col2, col3, col4)
   SELECT * FROM input          WHERE col3 IS NOT NULL
   ON     CONFLICT (col2, col3) WHERE col3 IS NOT NULL  -- matching index_predicate!
   DO     UPDATE
   SET    col4 = EXCLUDED.col4
   WHERE  t.col4 IS DISTINCT FROM EXCLUDED.col4  -- ②
   )
INSERT INTO my_table AS t(col2, col3, col4)
SELECT * FROM input    WHERE col3 IS NULL
ON     CONFLICT (col2) WHERE col3 IS NULL  -- matching index_predicate!
DO     UPDATE SET col4 = EXCLUDED.col4
WHERE  t.col4 IS DISTINCT FROM EXCLUDED.col4;  -- ②

db<>fiddle here

Works in every case.
Even works for multiple input rows with an arbitrary mix of NULL and NOT NULL values for col3.
And doesn't even cost much more than the plain statement because each row only enters into one of the two UPSERTs.

This is one of those "Eurika!" queries where everything just clicks, against all odds. :)

① Note the explicit cast to ::real in the CTE input. This related answer explains why:

② The final WHERE clause is optional, but highly recommended. It would be a waste to go through with the UPDATE if it doesn't actually change anything. See:

  • Related