I have the following table, with ~4 million rows:
CREATE TABLE members (
id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE,
-- other columns...
);
I use this following query to extract the latest updated rows:
SELECT *
FROM members
WHERE COALESCE(updated_at, created_at) > current_timestamp - interval '24 hours'
This query is obviously slow, so I created an index, but it is not used by Postgres:
CREATE INDEX members_updated_or_created_at ON members(COALESCE(updated_at, created_at));
Here's the execution plan:
Seq Scan on members (cost=0.00..171792.01 rows=1326991 width=1826) (actual time=62.663..22064.805 rows=1 loops=1)
Filter: (COALESCE(updated_at, created_at) > (CURRENT_TIMESTAMP - '48:00:00'::interval))
Rows Removed by Filter: 3980971
Planning Time: 0.123 ms
JIT:
Functions: 2
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 7.481 ms, Inlining 0.000 ms, Optimization 8.067 ms, Emission 35.308 ms, Total 50.857 ms
Execution Time: 22072.906 ms
I don't understand why it's doing a table scan instead of using an index scan. I also tried to select fewer fields, and adding a limit, but it didn't change anything.
EDIT:
So it seems like the index is not used because I'm fetching many columns that are not present in the index (select *).
I tried to do the same with the updated_at
column, and this time, the index is used if the only column I select is the "updated_at" column (Index Only Scan), it is not used if I include another column though.
What I don't understand, is why don't I get the same behavior with the coalesce function?
This query results in a full table scan
SELECT coalesce(updated_at, created_at)
FROM members
WHERE coalesce(updated_at, created_at) > current_timestamp - interval '7 days';
This query results in an Index Only Scan (index on updated_at)
SELECT updated_at
FROM members
WHERE updated_at > current_timestamp - interval '7 days';
CodePudding user response:
I found the solution, in order to force the DB to use my index, I added an "ORDER BY" clause, and it seems to work:
SELECT *
FROM members
WHERE coalesce(updated_at, created_at) > current_timestamp - interval '7 days'
ORDER BY coalesce(updated_at, created_at) DESC;
Index Scan Backward using members_updated_or_created_at on members (cost=0.43..446282.78 rows=1326991 width=1834) (actual time=8.367..8.369 rows=1 loops=1)
Index Cond: (COALESCE(updated_at, created_at) > (CURRENT_TIMESTAMP - '7 days'::interval))
Planning Time: 0.261 ms
JIT:
Functions: 5
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 2.065 ms, Inlining 0.000 ms, Optimization 0.825 ms, Emission 7.484 ms, Total 10.375 ms
Execution Time: 10.524 ms