I have a table in a postgresql (10.2) database something like this...
create table (user_id text, event_time timestamp, ...);
I'd like to use this table in a self join, to match records to other records from the same user_id and an event_time within the next 5 minutes. Something like this...
select
*
from
test as a
inner join
test as b
on
a.user_id = b.user_id
and a.event_time < b.event_time
and a.event_time > b.event_time - interval '5 minutes';
This works fine, but I'd ideally like to make it faster. I've gotten the join to use an index on user_id
, but I'm wondering if it's possible to make an index to use both user_id AND the timestamp?
I've tried making a gist index on a tsrange from the event time to the event time plus 5 minutes, but Postgres seemed to just use the user_id index in that case. I tried making a multi-column index on the user_id and the tsrange, but that doesn't seem supported.
Finally, I tried making an index on just the timestamp.
None of that seemed to help.
However, the timestamps cover a long time period, and I'm only interested in a 5-minute window, which intuitively feels like something a good index should help with.
Can this be accomplished?
CodePudding user response:
A multi-column index on the user_id
text and the event_time
timestamp should work. A gist index on the range would need to include the user id as well, and it would be less versatile since it would work only with the fixed interval of 5 minutes. I wouldn't use it unless what you actually want is to establish an exclusion constraint on the table.