How to create index for multiple columns, equal and greater than?-CodePudding

I have a table, it has these columns:

 id       | tx_hash | tx_status | created_at
---------- --------- ----------- ---------------------------
15000000  | 0x0011  | 0         | 2021-07-30 06:42:00.267694
15000001  | 0x0011  | 0         | 2021-07-30 06:42:00.267694
15000002  | 0x0011  | 0         | 2021-07-30 06:42:00.267694
...
16000000  | 0x0011  | 0         | 2021-07-30 06:42:00.267694

I want to query by 2 columns: tx_status by equal and id by "greater than". My SQL query is:

select id, tx_hash, tx_status, created_at 
from pool_transaction_entries 
where tx_status = 0 and id > 15006000 
order by id desc 
limit 1;

and I found it very slow !

and this is the analysis result:

postgres=> explain analyze verbose select id, tx_hash, tx_status, created_at from pool_transaction_entries where tx_status = 0 and id > 15006000 order by id  desc limit 1;
                                                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..133.80 rows=1 width=87) (actual time=21415.241..21415.242 rows=0 loops=1)
   Output: id, tx_hash, tx_status, created_at
   ->  Index Scan Backward using pool_transaction_entries_pkey on public.pool_transaction_entries  (cost=0.43..3868.12 rows=29 width=87) (actual time=21415.238..21415.239 rows=0 loops=1)
         Output: id, tx_hash, tx_status, created_at
         Index Cond: (pool_transaction_entries.id > 15006000)
         Filter: (pool_transaction_entries.tx_status = 0)
         Rows Removed by Filter: 3556
 Query Identifier: 3330758434230110582
 Planning Time: 54.206 ms
 Execution Time: 21415.281 ms
(10 rows)

could you please give me a clue how to create index or optimize for this? thanks a lot!

CodePudding user response：

The best index for your purpose will be :

CREATE INDEX X 
   ON pool_transaction_entries (tx_status, id) 
   INCLUDE (tx_hash, created_at);

Because :

tx_status is seeked by =
id is seeked by >

So the index key must be in this exact order...

The INCLUDE clause is used to avoid a table access after the index seek. Ordinary indexes does not containes all the table's columns, only the key columns. INCLUDE says that some more columns are added to the leaf pages of the index to have a covering index. This technics has been added in Microsoft SQL Server since 2005 and recently added to PostGreSQL to mimics... Such technic is very efficient because only one object, the index itself why cover all the query clauses (SELECT clause, WHERE clause, GROUP BY CLause, HAVING clause, ORDER BY clause...) without any table access

CodePudding user response：

thanks @a_horse_with_no_name, this answer comes from him:

just create an index on (tx_status, id)