I've a feed with 600k rows of property listings that I'd like to upsert daily into a table. But I'd also like to insert the old columns (or the whole row) into a different table in order to track the changes in the data. I need to know if, for example, how some property price evolved in the previous year.
The problem with the upsert is that is going to need all the fields to be specified, because I'd like to track the changes for every column so the "ON CONFLICT" clause would get too complicated:
INSERT INTO prop_listings (id, prop_type, price, rooms)
VALUES
(33, 'house', 60000, 4)
, (22, 'apartment', 30000, 2)
ON CONFLICT (id, prop_type) -- complicated clause
DO UPDATE SET
prop_type = EXCLUDED.price_usd
, price = EXCLUDED.volume_24h
...;
Another issue would be to track the old values (the columns being updated) in another table. I've read that this could be done with a TRIGGER.
So I'm thinking if that's the best way to do this or if there's anything else I'm missing.
CodePudding user response:
You don't list all columns in the ON CONFLICT
clause, just the id
(or whatever constitutes your primary key). So it's simpler:
INSERT INTO prop_listings AS p (id, prop_type, price, rooms)
VALUES
(33, 'house', 60000, 4)
, (22, 'apartment', 30000, 2)
ON CONFLICT (id) -- simple clause
DO UPDATE
SET prop_type = EXCLUDED.price_usd
, price = EXCLUDED.volume_24h
, rooms = EXCLUDED.rooms
WHERE (p.prop_type, p.price, p.rooms) IS DISTINCT FROM
(EXCLUDED.prop_type, EXCLUDED.price, EXCLUDED.rooms);
I added a WHERE
clause to skip empty updates that wouldn't change anything - also skipping below trigger. This can make a huge difference if most of your new row versions are unchanged. See:
To keep track of old row versions, you could create a backup table with identical structure (let's call it prop_backup
) and add a trigger like:
CREATE OR REPLACE FUNCTION trg_prop_backup()
RETURNS trigger
LANGUAGE plpgsql AS
$func$
BEGIN
INSERT INTO prop_backup -- table has same columns!
VALUES (OLD.*);
RETURN NEW;
END
$func$;
CREATE TRIGGER prop_listings_upd_bef
BEFORE UPDATE ON prop_listings
FOR EACH ROW EXECUTE FUNCTION trg_prop_backup();
Related: