Home > OS >  Updating multiple rows with a single command in postgres throws duplicate key
Updating multiple rows with a single command in postgres throws duplicate key

Time:07-02

Initial values:

 item | color
------ -------
 pen  | red
 ball | green
 pen  | blue

I am trying to update all my rows so that all items end up having the color green, if there's already an item with color green then I want the following:

  1. Do not remove the row getting updated from color X to green. (Tried with the command given below).

Expected output:

 item | color
------ -------
 pen  | green
 ball | green
 pen  | blue
  1. Delete the row getting updated from color X to green.

Expected output:

 item | color
------ -------
 pen  | green
 ball | green

Setup query:

CREATE TABLE my_table
(
    item varchar(50), 
    color varchar(50), 
    PRIMARY KEY (item, color)
);

INSERT INTO my_table (item, color)
VALUES 
    ('pen', 'red'),
    ('ball', 'green'),
    ('pen', 'blue');
 item | color
------ -------
 pen  | red
 ball | green
 pen  | blue

Update command:

UPDATE my_table a
SET color = 'green'
WHERE NOT EXISTS (SELECT 1 FROM my_table b 
                  WHERE b.item = a.item AND b.color = 'green');

I get this error:

ERROR: duplicate key value violates unique constraint "my_table_pkey"
DETAIL: Key (item, color)=(pen, green) already exists.

CodePudding user response:

For some reason, my dbfiddle is not working with Oracle DB right now. Therefore, I made dbfiddle example for PG and I will try to write the logic below.

  1. First of all - if we have a line already "green", then we do not need to update it again to "green" (more precisely, this can be done, but this affects performance). Also - that's a reason of duplicate key error (coz PK).

  2. If we have more than two rows with different colors for one item, then only the first one needs to be changed to "green" (you can add sorting, for example, by color name, ID or creating_time - up to you)

So update script would be like this:

-- update from table above by condition
WITH
    t_upd AS (SELECT item,
                     color,
                     CASE WHEN color = 'green' THEN 0 ELSE 1 END AS already_green,
                     ROW_NUMBER() OVER (PARTITION BY item ORDER BY CASE WHEN color = 'green' THEN 0 ELSE 1 END, color asc) AS rn
                  FROM my_table)

UPDATE my_table a
SET
    color = 'green'
    FROM t_upd t
    WHERE a.item = t.item
      AND a.color = t.color
      AND t.already_green != 0
      AND t.rn = CASE WHEN t.already_green = 0 THEN 2 ELSE 1 END;

Let's summarize: if the item contains a "green" color, we skip it, otherwise, we recolor any other color for it into "green" (the first one according to the sorting condition)

CodePudding user response:

In Postgres you can get the system column CTID which identifies the exact physical location of the row version(s) to be updated. Build a CTE which retrieves that system column then in the main update that specific row. (see demo)

with single(row_loc) as 
     ( select distinct on (item) 
              ctid  
         from my_table  mt1
        where not exists (select null 
                            from my_table mt2 
                           where mt2.item = mt1.item 
                             and mt2.color = 'green'
                         ) 
       order by item, color
     )                            
update my_table 
  set color = 'green' 
where ctid in (select row_loc from single); 
  • Related