We have a large table (2.8M rows) where we are finding a single row by our device_token
column
CREATE TABLE public.rpush_notifications (
id bigint NOT NULL,
device_token character varying,
data text,
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL,
...
We are constantly doing the following query:
SELECT * FROM rpush_notifications WHERE device_token = '...' ORDER BY updated_at DESC LIMIT 1
I'd like to add a index for our device_token
column, and I'm wondering if there is any benefit to creating an additional index for updated_at
or creating a multicolumn index for both columns device_token
and updated_at
given that we are ordering by, i.e.:
CREATE INDEX foo ON rpush_notifications(device_token, updated_at)
I have been unable to find an answer that would help me understand if there would be any performance benefit to adding updated_at
to the index given the query we are running above. Any help appreciated. We are running Postgresql11
CodePudding user response:
There is a performance benefit if you combine both columns just like you did ((device_token, updated_at)
), because the database can easily find the entries with the specific device_token
and it does not need to do the sorting during the query.
Even better would be an index on (device_token, updated_at DESC)
as it gives you the requested row as the first one with this device_token
, so there is no need to get the first and start a sequential scan from there on to find the last.