Below is my sql that I am trying to run. But while I am executing this, the cost is very high. How can I optimize this query? This query also sometimes takes more than 6 minutes to execute. The table that I am querying has more than 1 million data.
SELECT * FROM application app
WHERE app.user_id = '123456*******'
AND app.updated_at is not null
AND app.status IN ('YES', 'NO')
AND app.amount > 0
ORDER BY app.created_at desc limit 1
Production:
Limit (cost=0.56..2017.59 rows=1 width=982) (actual time=421864.746..421864.748 rows=1 loops=1)
-> Index Scan Backward using application_created_at on application app (cost=0.56..7533571.71 rows=3735 width=982) (actual time=421864.745..421864.745 rows=1 loops=1)
Filter: ((updated_at IS NOT NULL) AND ((status)::text = ANY ('{YES,NO}'::text[])) AND (amount > '0'::numeric) AND ((user_id)::text = '123456*****'::text))
Rows Removed by Filter: 27483657
Planning time: 0.194 ms
Execution time: 421864.773 ms
I have also added the below index, but somehow its not using this index and using another index application_created_at.
CREATE INDEX Concurrently application_user_info ON public.application USING btree (user_id,status,updated_at,amount);
CodePudding user response:
It’s probably not going to use the index you want because of this line (which is not providing a value in the index)
AND app.updated_at is not null
Try altering your index to move this column to the end i.e.
CREATE INDEX Concurrently application_user_info ON public.application USING btree (user_id,status,amount,updated_at);
CodePudding user response:
The idea with a multicolumn index for this query would be to jump to the part of the index containing the data you need, using the first N columns of the index then read the index already ordered by the N 1 column.
The index you give won't work for this, because there is no one spot in the index where everything satisfying app.user_id = '123456*******' and app.status IN ('YES', 'NO')
is gathered. Rather there are two spots, one for YES and one for NO. In theory, PostgreSQL could look at those two spots and merge them back together in order with an Append Merge, but it won't do that the way your query is currently written. (Actually it couldn't do that at all, as the next column in the index is not the one I though it was--see later)
The easiest thing to try is to either take "status" out of the index, or move it to the end where it could still be used for Index-only-scans but doesn't interfere with reading in order (by created_at desc) once jumping to the place where all the app.user_id = '123456*******' reside.
So the index would be:
... ON public.application (user_id,created_at, status,updated_at,amount);
The last 3 columns are just there to provide easier filtering, they could be in any order or not there at all.