Home > Software design >  Postgres cursor update each row in separate transaction
Postgres cursor update each row in separate transaction

Time:04-01

I have a million-row table in Postgres 13 that needs a one-time update of each row: the (golang) script will read the current column value for each row, transform it, then update the row with the new value, for example:

DECLARE c1 CURSOR FOR SELECT v FROM users;

FETCH c1;
-- read and transform v
UPDATE users SET v = ? WHERE CURRENT OF c1;
-- transaction committed

FETCH c1;
...

I'm familiar with cursors for reading, but have a few requirements for writing that I'm struggling to find the right settings for:

  1. I don't want it all to run in a single huge transaction, which is the default with cursors, since the change set will be large and it will take a while. I'd rather each update be its own transaction, and I can re-run the idempotent script again if it fails for any reason. I'm aware of DECLARE WITH HOLD to have the cursor span transactions, but...

  2. By default the data read by the cursor is "insensitive" (a snapshot from when the cursor was first created), but I would like the latest data for each row with FETCH in case there has been a subsequent update. The solution to that is to use FOR UPDATE in the cursor query to make it "sensitive," but that is not allowed together with WITH HOLD. I would prefer the row lock you get with FOR UPDATE to prevent the read-then-write race condition between FETCH and UPDATE, but it's not mandatory

How can I iterate all rows and update them one at a time without having to read everything into memory first?

CodePudding user response:

Make the cursor be WITH HOLD, but select the pk rather than v. Then in the loop, select the now-current v from the table based the pk (rather than current of main), and update it using the pk.

  • Related