I am using MySQL 5.6 and try to optimize next query:
SELECT t1.field1,
...
t1.field30,
t2.field1
FROM Table1 t1
JOIN Table2 t2 ON t1.fk_int = t2.pk_int
WHERE t1.int_field = ?
AND t1.enum_filed != 'value'
ORDER BY t1.created_datetime desc;
A response can contain millions of records and every row consists of 31 columns.
Now EXPLAIN says in Extra that planner uses 'Using where'.
I tried to add next index:
create index test_idx ON Table1 (int_field, enum_filed, created_datetime, fk_int);
After that EXPLAIN says in Extra that planner uses "Using index condition; Using filesort"
"rows" value from EXPLAIN with index is less than without it. But in practice time of execution is longer.
So, the questions are next:
- What is the best index for this query?
- Why EXPLAIN says that 'key_len' of query with index is 5. Shouldn't it be 4 1 8 4=17?
- Should the fields from ORDER BY be in index?
- Should the fields from JOIN be in index?
CodePudding user response:
try refactor your index this way
avoid (o move to the right after fk_int) the created_datetime column.. and move fk_int before the enum_filed column .. the in this wahy the 3 more colums used for filter shold be use better )
create index test_idx ON Table1 (int_field, fk_int, enum_filed);
be sure you have also an specific index on table2 column pk_int. if you have not add
create index test_idx ON Table2 (int_field, fk_int, enum_filed);
CodePudding user response:
My guess for your question no. 1:
create index my_idx on Table1(int_field, created_datetime desc, fk_int)
or one of these (but neither will probably be worthwhile):
create index my_idx on Table1(int_field, created_datetime desc, enum_filed, fk_int)
create index my_idx on Table1(int_field, created_datetime desc, fk_int, enum_filed)
I'm supposing 3 things:
Table2.pk_int
is already a primary key, judging by the name- The
where
condition onTable1.int_field
applies to a small subset ofTable1
- The inequality on
Table1.enum_filed
(I would fix the typo, if I were you) only excludes a small subset ofTable1
I can't answer question no. 2.
As for questions 3 and 4: If, as I suppose, it's more efficient to start the query plan from the where
condition on Table1.int_field
, the composite index, in this case also with the correct sort order (desc
), enables a scan of the index to get the output rows in the correct order, without an extra sort step. Furthermore, adding also fk_int
to the index makes the retrieval of any record of Table1
unnecessary unless a corresponding record has been found in Table2
. For a similar reason you could also add enum_filed
to the index, but, if this doesn't considerably reduce the output record count, the increase in index size will make things worse instead of better. In the end, you will have to try it out (with realistic data!).
Note that if you put another column between int_field
and created_datetime
in the index, you won't get the created_datetime
in the desired output order.