I have a table, event
, with a column unique_time
of type timestamptz
. I need each of the values in unique_time
to be unique.
Given a timestamptz
input, input_time
, I need to find the minimum timestamptz
value that satisfies the following criteria:
- the result must be >=
input_time
- the result must not already be in
unique_time
I cannot merely add one microsecond to the greatest value in unique_time
, because I need the minimum value that satisfies the above criteria.
Is there a concise way to compute this as part of an insert or update to the event
table?
CodePudding user response:
I suggest a function with a loop:
CREATE OR REPLACE FUNCTION f_next_free(_input_time timestamptz, OUT _next_free timestamptz)
LANGUAGE plpgsql STABLE STRICT AS
$func$
BEGIN
LOOP
SELECT INTO _next_free _input_time
WHERE NOT EXISTS (SELECT FROM event WHERE unique_time = _input_time);
EXIT WHEN FOUND;
_input_time := _input_time interval '1 us';
END LOOP;
END
$func$;
Call:
SELECT f_next_free('2022-05-17 03:44:22.771741 02');
Be sure to have an index on event(unique_time)
. If the column is defined UNIQUE
or PRIMARY KEY
, that index is there implicitly.
Related:
- Can I make a plpgsql function return an integer without using a variable?
- Select rows which are not present in other table
- BREAK statement in PL/pgSQL
Since Postgres timestamps have microsecond resolution, the next free timestamp is at least 1 microsecond (interval '1 us'
) away. See:
Could also be a recursive CTE, but the overhead is probably bigger.
Concurrency!
Is there a concise way to compute this as part of an
INSERT
orUPDATE
to theevent
table?
The above is obviously subject to a race condition. Any number of concurrent transaction might find the same free spot. Postgres cannot lock rows that are not there, yet.
Since you want to INSERT
(similar for UPDATE
) I suggest INSERT .. ON CONFLICT DO NOTHING
instead in a loop directly. Again, we need a UNIQUE
or PRIMARY KEY
on unique_time
:
CREATE OR REPLACE FUNCTION f_next_free(INOUT _input_time timestamptz, _payload text)
LANGUAGE plpgsql AS
$func$
BEGIN
LOOP
INSERT INTO event (unique_time, payload)
VALUES (_input_time, _payload)
ON CONFLICT DO NOTHING;
EXIT WHEN FOUND;
_input_time := _input_time interval '1 us';
END LOOP;
END
$func$;
Adapt your "payload" accordingly.
A successful INSERT
locks the row. Even if concurrent transactions cannot see the inserted row yet, a UNIQUE
index is absolute.
(You could make it work with advisory locks ...)
CodePudding user response:
Ah, forgot about the approaches from my comment that would try to generate an (infinite) sequence of all microsecond timestamps following the $input_time
. There's a much simpler query that can generate exactly the timestamp you need:
INSERT INTO event(unique_time, others)
SELECT MIN(candidates.time), $other_values
FROM (
SELECT $input_time AS "time"
UNION ALL
SELECT unique_time 1 microsecond AS time
FROM event
WHERE unique_time >= $input_time
) AS candidates
WHERE NOT EXISTS (
SELECT *
FROM unique_time coll
WHERE coll.unique_time = candidates.time
);
However, I'm not sure how well Postgres can optimise this, the MIN
aggregate might load all the timestamps from event
that are larger than $input_time
- which might be fine if you always append events at the end, but still. A probably better alternative would be
INSERT INTO event(unique_time, others)
SELECT available.time, $other_values
FROM (
SELECT *
FROM (
SELECT $input_time AS "time"
UNION ALL
SELECT unique_time 1 microsecond AS time
FROM event
WHERE unique_time >= $input_time
) AS candidates
WHERE NOT EXISTS (
SELECT *
FROM unique_time coll
WHERE coll.unique_time = candidates.time
)
ORDER BY candidates.unique_time ASC
) AS available
ORDER BY available.time ASC
LIMIT 1;
This might (I don't know) still have to evaluate the complex subquery every time you insert something though, which would be rather inefficient if most of the input don't cause a collision. Also I have no idea how well this works under concurrent loads (i.e. multiple transactions running the query at the same time) and whether it has possible race conditions.
Alternatively, just use a WHILE
loop (in the client or PL/SQL) that attempts to insert the value until it succeeds and increments the timestamp on every iteration - see @Erwin Brandstetter's answer for that.