I have an invoices
table like this:
| id | client_id | is_recurring |
|----- |-----------| -------------|
| 1 | 2121 | 0 |
| 2 | 7434 | 1 |
Now throughout my application, I might have following queries:
select * from invoices where client_id = 2121 and is_recurring = 0;
or
select * from invoices where is_recurring = 0 and client_id = 2121;
or any other order of where clause.
I already have index on client_id an is_recurring separately.
But for composite index, should I create composite index on
compost_index('client_id','is_recurring')
or
compost_index('is_recurring','client_id')
or Both?
Note that both are in different order. So for performance of different order searches? should I created composite index with multiple order/direction?
Update:
Also if I have a date
column which I will use for comparing greater or less or order by, what combinations of composite indexes I should use?
CodePudding user response:
As a rough rule of thumb, you might expect better performance in a two column index by placing the more restrictive (higher cardinality) column first. So, I would suggest:
CREATE INDEX ON invoices compost_index (client_id, is_recurring)
This index, if used, would let MySQL discard a large portion of the index just by filtering by the client_id
. The is_recurring
column, on the other hand, presumably only takes on the two values of 0 and 1. So, filtering by this might not allow for many records to be discarded when scanning the index.
CodePudding user response:
Either WHERE order;
Either INDEX order - regardless of cardinality; see proof
That is, either WHERE
is handled equally well by either INDEX
.
Neither single-column index - They may get in the way.
Meanwhile, the composite index also handles the corresponding single-column need. That is, INDEX(a,b)
takes care of cases where INDEX(a)
is needed, but not where INDEX(b)
is needed.
For this:
where client_id = 2121 and is_recurring = 0 and date > '2021-04-28';
a new rule is used:
INDEX(client_id, is_recurring, -- in either order
date) -- then the range test
That is, put all columns tested with =
(or IS NULL
) first; then you get one chance to add a range test.
where client_id = 2121 and date > '2021-04-28' -- (client_id, date)
where client_id = 2121 order by date > '2021-04-28'
-- also (client_id, date), but now the order is required
where client_id = 2121
and date > '2021-04-28'
order by date -- again (client_id, date), specific ordering
where client_id = 2121
and is_recurring = 0
and date > '2021-04-28'; -- back to the 3-column one above
An =
test is one thing; all inequality tests are "range".