Home > front end >  Index for self-join on timestamp range and user_id
Index for self-join on timestamp range and user_id

Time:12-03

I have a table in a postgresql (10.2) database something like this...

create table (user_id text, event_time timestamp, ...);

I'd like to use this table in a self join, to match records to other records from the same user_id and an event_time within the next 5 minutes. Something like this...

select
    *
from
    test as a
inner join
    test as b
on
    a.user_id = b.user_id
    and a.event_time < b.event_time
    and a.event_time > b.event_time - interval '5 minutes';

This works fine, but I'd ideally like to make it faster. I've gotten the join to use an index on user_id, but I'm wondering if it's possible to make an index to use both user_id AND the timestamp?

I've tried making a gist index on a tsrange from the event time to the event time plus 5 minutes, but Postgres seemed to just use the user_id index in that case. I tried making a multi-column index on the user_id and the tsrange, but that doesn't seem supported.

Finally, I tried making an index on just the timestamp.

None of that seemed to help.

However, the timestamps cover a long time period, and I'm only interested in a 5-minute window, which intuitively feels like something a good index should help with.

Can this be accomplished?

CodePudding user response:

A multi-column index on the user_id text and the event_time timestamp should work. A gist index on the range would need to include the user id as well, and it would be less versatile since it would work only with the fixed interval of 5 minutes. I wouldn't use it unless what you actually want is to establish an exclusion constraint on the table.

  • Related