Home > OS >  How to avoid unnecessary updates when using on conflict with Postgres?
How to avoid unnecessary updates when using on conflict with Postgres?

Time:09-23

My use case involves syncing a table with an upstream source on a recurring schedule.

Each row has a unique identifier and other columns, and I want to make sure I'm inserting any new upstream rows, and updating any changed upstream rows. And there could be thousands of rows to sync.

But I'd like to avoid unnecessary updates where the row in the database doesn't differ from what's upstream.

Currently I'm using ON CONFLICT UPDATE like so:

INSERT INTO symbols (id, name, status) 
VALUES 
  (1, 'one', 'online'),
  (2, 'two', 'offline'),
  ...
ON CONFLICT (id) 
UPDATE SET (id, name, status) = (excluded.id, excluded.name, excluded.status)
RETURNING *

But this will write the updates even when nothing is changing. How should I tweak the UPDATE to performantly check and apply to rows that need it?

CodePudding user response:

You can add a where clause to only update those rows that are different.

INSERT INTO symbols (id, name, status) 
VALUES 
  (1, 'one', 'online'),
  (2, 'two', 'offline'),
  ...
ON CONFLICT (id) DO
UPDATE SET (id, name, status) = (excluded.id, excluded.name, excluded.status)
WHERE (symbols.id, symbols.name, symbols.status) IS DISTINCT FROM (excluded.id, excluded.name, excluded.status)
RETURNING *

However, this will only return the rows that are actually updated, which may impact how you use the returning clause.

  • Related