I have a table, it has these columns:
id | tx_hash | tx_status | created_at
---------- --------- ----------- ---------------------------
15000000 | 0x0011 | 0 | 2021-07-30 06:42:00.267694
15000001 | 0x0011 | 0 | 2021-07-30 06:42:00.267694
15000002 | 0x0011 | 0 | 2021-07-30 06:42:00.267694
...
16000000 | 0x0011 | 0 | 2021-07-30 06:42:00.267694
I want to query by 2 columns: tx_status
by equal
and id
by "greater than". My SQL query is:
select id, tx_hash, tx_status, created_at
from pool_transaction_entries
where tx_status = 0 and id > 15006000
order by id desc
limit 1;
and I found it very slow !
and this is the analysis result:
postgres=> explain analyze verbose select id, tx_hash, tx_status, created_at from pool_transaction_entries where tx_status = 0 and id > 15006000 order by id desc limit 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..133.80 rows=1 width=87) (actual time=21415.241..21415.242 rows=0 loops=1)
Output: id, tx_hash, tx_status, created_at
-> Index Scan Backward using pool_transaction_entries_pkey on public.pool_transaction_entries (cost=0.43..3868.12 rows=29 width=87) (actual time=21415.238..21415.239 rows=0 loops=1)
Output: id, tx_hash, tx_status, created_at
Index Cond: (pool_transaction_entries.id > 15006000)
Filter: (pool_transaction_entries.tx_status = 0)
Rows Removed by Filter: 3556
Query Identifier: 3330758434230110582
Planning Time: 54.206 ms
Execution Time: 21415.281 ms
(10 rows)
could you please give me a clue how to create index or optimize for this? thanks a lot!
CodePudding user response:
The best index for your purpose will be :
CREATE INDEX X
ON pool_transaction_entries (tx_status, id)
INCLUDE (tx_hash, created_at);
Because :
- tx_status is seeked by =
- id is seeked by >
So the index key must be in this exact order...
The INCLUDE clause is used to avoid a table access after the index seek. Ordinary indexes does not containes all the table's columns, only the key columns. INCLUDE says that some more columns are added to the leaf pages of the index to have a covering index. This technics has been added in Microsoft SQL Server since 2005 and recently added to PostGreSQL to mimics... Such technic is very efficient because only one object, the index itself why cover all the query clauses (SELECT clause, WHERE clause, GROUP BY CLause, HAVING clause, ORDER BY clause...) without any table access
CodePudding user response:
thanks @a_horse_with_no_name, this answer comes from him:
just create an index on (tx_status, id)