I have a table with a trigger that sets an order number per customer, but I constantly get unique key violations in that table:
SQLSTATE[23505]: Unique violation: 7 ERROR: duplicate key value violates unique constraint "order_customer_unique"
The trigger is defined as BEFORE INSERT ON orders FOR EACH ROW.
Is there any technique to solve this concurrency problem?
Ideally, the number per customer is a sequence and does not skip numbers.
CREATE OR REPLACE FUNCTION "public"."trigger_set_order_customer_number"()
RETURNS "pg_catalog"."trigger" AS
$BODY$
BEGIN
NEW.sequence := (
   SELECT COALESCE(MAX(orders.sequence), 0) + 1
   FROM   orders
   WHERE  orders.customer_type = NEW.customer_type
   AND    orders.customer_id   = NEW.customer_id
   AND    orders.model_id      = NEW.model_id
   LIMIT  1);
RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
CodePudding user response:
Before you proceed, consider not doing this. Gap-less sequential numbers per partition under concurrent write load have many implications and can become a costly endeavor. Much depends on the details of your setup and requirements.
The trigger you have is inherently unsafe against concurrent writes. Conflicts are to be expected, as each transaction sees its own snapshot of the table, and the calculated "next number" may end up being the same for multiple concurrent inserts.
You could escalate to SERIALIZABLE transaction isolation. But that's expensive, and you have to prepare for serialization failures, just like you have to prepare for unique violations now. Catch-22.
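To illustrate that route, a minimal sketch of what the application side would have to do (the retry handling is the application's responsibility; SQLSTATE 40001 is Postgres's standard serialization_failure code, everything else here is a hypothetical example):

BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO orders (customer_type, customer_id, model_id)
VALUES (1, 2, 3);  -- trigger computes NEW.sequence
COMMIT;
-- On SQLSTATE 40001 (serialization_failure), the client must
-- retry the whole transaction, possibly in a loop with backoff.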
A cheaper approach is to add a parent table with exactly one row per partition (per "unique customer account"). Then you can lock a row there, so that each unique customer account can only be manipulated by a single transaction at a time. That's typically a viable path and performs better. But it all depends ...
While we are at it, we can put an order_count into that parent table to make everything simpler and cheaper. Could look like this:
CREATE TABLE unique_customer_account(
customer_id int -- replace with actual data types
, customer_type int
, model_id int
, order_count int NOT NULL DEFAULT 0
, PRIMARY KEY (customer_id, customer_type, model_id)
);
CREATE OR REPLACE FUNCTION public.trigger_set_order_customer_number()
RETURNS trigger
LANGUAGE plpgsql AS
$func$
BEGIN
-- this locks the row and makes concurrent transactions targeting
-- the same unique_customer_account wait till this one is done
INSERT INTO unique_customer_account AS u
( customer_type, customer_id, model_id, order_count)
VALUES (NEW.customer_type, NEW.customer_id, NEW.model_id, 1)
ON CONFLICT (customer_type, customer_id, model_id) DO UPDATE -- ①
SET order_count = u.order_count + 1 -- ②
RETURNING u.order_count
INTO NEW.sequence; -- ③
IF NOT FOUND THEN
RAISE EXCEPTION 'Row in "unique_customer_account" not found!'; -- ④
END IF;
RETURN NEW;
END
$func$;
CREATE TRIGGER orders_insbef
BEFORE INSERT ON orders
FOR EACH ROW EXECUTE FUNCTION public.trigger_set_order_customer_number();
db<>fiddle here
A quick local test with concurrent inserts also works as advertised.
① The UPDATE implicitly takes out an exclusive lock on the row, same as SELECT ... FOR UPDATE.
② No COALESCE needed, since order_count is defined NOT NULL.
③ We can conveniently assign the incremented counter to NEW.sequence directly.
④ The row in unique_customer_account must exist. I built an UPSERT into the trigger; or you might handle it separately. Either way, there is another (less common) race condition if multiple concurrent transactions try to enter a new row in unique_customer_account at the same time, which can result in no row from the UPSERT. It can be solved.
A simple and clean solution to that next problem would be to INSERT ... ON CONFLICT DO NOTHING new rows in unique_customer_account in a separate command before inserting any orders. It can be in the same transaction, but not a CTE of the same command.
Then you can assume the row in unique_customer_account exists and reduce the trigger to a plain UPDATE. All race conditions gone. (Except exotic ones.)
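That simplified variant could be sketched like this (same table and column names as above; an untested sketch, with placeholder values in the separate INSERT):

-- separate command, before inserting any orders for a new account:
INSERT INTO unique_customer_account (customer_type, customer_id, model_id)
VALUES (1, 2, 3)  -- the new customer account
ON CONFLICT DO NOTHING;

-- trigger function reduced to a plain UPDATE:
CREATE OR REPLACE FUNCTION public.trigger_set_order_customer_number()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- locks the row; concurrent transactions for the same account wait
   UPDATE unique_customer_account u
   SET    order_count = u.order_count + 1
   WHERE  u.customer_type = NEW.customer_type
   AND    u.customer_id   = NEW.customer_id
   AND    u.model_id      = NEW.model_id
   RETURNING u.order_count
   INTO   NEW.sequence;

   IF NOT FOUND THEN
      RAISE EXCEPTION 'Row in "unique_customer_account" not found!';
   END IF;

   RETURN NEW;
END
$func$;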
Like I said, many intricate details ...
CodePudding user response:
YES! Do not use a trigger. Define your sequence column as integer GENERATED ALWAYS AS IDENTITY. That eliminates the need for the trigger completely, as Postgres handles assigning the value; you do not - indeed you cannot. Just FYI, assigning a value as MAX + 1 is virtually guaranteed to create duplicates. Unfortunately, this will lose numbers, as there will be gaps at some point. Generating gap-less sequences is quite difficult, and for virtually no gain. Unless there is a legal requirement, it is best you just get used to having missing values and move on.