Postgresql - how can I clone a set of records but maintain a mapping between the original ids and th-CodePudding

I come from a SQL Server background and our team is migrating to Postgres (version 9.5).

We have a number of scripts that perform MERGE statements that essentially 'clone' rows in a table and insert them back into the same table with a new Id while maintaining a map between the cloned records and the records they were cloned from.

I'm having a hard time trying to replicate this behavior. I've tried a number of variations, but I still can't seem to find the right combination of temp tables and CTEs to get it right.

Here's an approximation of the latest version that doesn't work:


CREATE SCHEMA stackoverflow;

CREATE TABLE stackoverflow.clone_problem
(
    id bigserial PRIMARY KEY NOT NULL,
    some_id bigint NULL,
    some_other_id bigint NULL,
    modified_time timestamp NOT NULL DEFAULT now(),
    modified_by varchar(128) NOT NULL DEFAULT current_user
);

  
INSERT INTO stackoverflow.clone_problem
(
   id,
   some_id,
   some_other_id
)
VALUES (1,1,1)
      ,(2,2,2)
      ,(3,3,3);

;WITH sources
        AS 
        (
            SELECT 
                id as old_id,
                some_id,
                some_other_id
            FROM stackoverflow.clone_problem
            WHERE id = ANY('{1,3}')
        ),  
        inserts
        AS 
        (
            INSERT INTO stackoverflow.clone_problem
            (
             some_id,
             some_other_id
            )
            SELECT 
                s.some_id,
                s.some_other_id
            FROM sources s
            RETURNING id as new_id, s.id as old_id -- this doesn't work 
        )
        SELECT * from inserts;

The final select statement is the output I'm trying to capture--either from a RETURNING statement of by other means-- so we know which records were cloned and what their new Ids are. But the code above throws this error: error: missing FROM-clause entry for table "s".

I don't understand because 's' is in the FROM clause so the error seems counterintuitive to me. I'm sure I'm missing something dumb, but I just can't seem to figure how to get that final piece of information.

Any help would be greatly appreciated.

CodePudding user response：

I think your only chance is to generate the ID before you do the insert so that you have the mapping between old and new ID right away. This can be done by calling nextval() when retrieving the source rows, then providing that already generated ID during the INSERT

with sources as (
  SELECT id as old_id, 
         nextval(pg_get_serial_sequence('stackoverflow.clone_problem', 'id')) as new_id,
         some_id,
         some_other_id
  FROM clone_problem 
  WHERE id IN (1,3)
), inserts as (
  INSERT INTO clone_problem (id, some_id, some_other_id)
  SELECT s.new_id, 
         s.some_id,
         s.some_other_id
  FROM sources s
)
select old_id, new_id
from sources;

By using pg_get_serial_sequence you don't need know the name of the sequence directly.