Say I have a table of marbles
id | color | total |
---|---|---|
1 | blue | 5 |
2 | red | 10 |
3 | swirly | 3 |
and I need to put them into bags with a unique constraint on (bag_id, marble_id)
:
bag_id | marble_id | quantity |
---|---|---|
1 | 1 (blue) | 2 |
1 | 2 (red) | 3 |
2 | 1 (blue) | 2 |
I have a query for bagging at most the number of remaining marbles
WITH unbagged AS (
SELECT
marble.total - COALESCE( SUM( bag.quantity ), 0 ) AS quantity
FROM marble
LEFT JOIN bag ON marble.id = bag.marble_id
WHERE marble.id = :marble_id
GROUP BY marble.id )
INSERT INTO bag (bag_id, marble_id, quantity)
SELECT
:bag_id,
:marble_id,
LEAST( :quantity, unbagged.quantity )
FROM unbagged
ON CONFLICT (bag_id, marble_id) DO UPDATE SET
quantity = bag.quantity
LEAST(
EXCLUDED.quantity,
(SELECT quantity FROM unbagged) )
which works great until one day, it gets called twice at exactly the same time with the same item and I end up with 6 swirly marbles in a bag (or maybe 3 each in 2 bags), even though there are only 3 total.
I think I understand why, but I don't know how to prevent this from happening?
CodePudding user response:
Your algorithm isn't exactly clear to me, but the core issue is concurrency.
Manual locking
Your query processes a single given row in table marble
at a time. The cheapest solution is to take an exclusive lock on that row (assuming that's the only query writing to marble
and bag
). Then the next transaction trying to mess with the same kind of marble has to wait until the current one has committed (or rolled back).
BEGIN;
SELECT FROM marble WHERE id = :marble_id FOR UPDATE; -- row level lock
WITH unbagged AS ( ...
COMMIT;
SERIALIZABLE
Or use serializable transaction isolation, that's the more expensive "catch-all" solution - and be prepared to repeat the transaction in case of a serialization error. Like:
BEGIN ISOLATION LEVEL SERIALIZABLE;
WITH unbagged AS ( ...
COMMIT;
Related: