Home > other >  How do composite indexes work and under what circumstances will they fail?
How do composite indexes work and under what circumstances will they fail?

Time:12-15

When building the index, I have some questions about the composite indexes.

Here is my SHOW CREATE TABLE

CREATE TABLE `sys_alarm`  (
  `id` bigint(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  `dept_id` bigint(20) NULL DEFAULT NULL,
  `device_code` varchar(80) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `type` char(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr_name` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `attr_value` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `content` varchar(200) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL,
  `create_time` timestamp(0) NOT NULL DEFAULT CURRENT_TIMESTAMP(0),
  PRIMARY KEY (`alarm_id`) USING BTREE,
  INDEX `search`(`dept_id`, `device_code`, `attr_name`, `create_time`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 16610 CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci ROW_FORMAT = Compact;

When I execute the following statement, the result I get confused me:

EXPLAIN
select a.alarm_id, a.dept_id, a.device_code, a.type, a.attr_name, a.attr_value, a.content, a.create_time
FROM sys_alarm a
WHERE a.dept_id = 214 
and a.create_time > "2021-11-06 15:00:17"

EXPLAIN
select a.alarm_id, a.dept_id, a.device_code, a.type, a.attr_name, a.attr_value, a.content, a.create_time
FROM sys_alarm a
WHERE a.create_time > "2021-11-06 15:00:17"
and a.dept_id = 214 

and it result:

id  select_type table  partitions   type    possible_keys   key     key_len ref    rows  filtered    Extra
1   SIMPLE      a       null        ref     search          search  9       const   1    33.33 Using index condition

I read this sentence on the MySQL documentation:

MySQL cannot use the index to perform lookups if the columns do not form a leftmost prefix of the index. Suppose that you have the SELECT statements shown here

so,why does the composite indexes still work? Only after I remove the dept_id query condition, the index will fail.What are the conditions for the composite indexes failure?

CodePudding user response:

MySQL will only use the "leftmost prefix" means that it will only use the dept_id portion of your index as it cannot skip to create_time. This is illustrated by the fact that key_len is 9 in the EXPLAIN output. I am a little confused as I would expect it to be 8 (the byte length of your BIGINT). There's a hole in my understanding here. Hopefully someone will explain this in the comments.

If you run your EXPLAIN with and without the create_time condition you should see the same result -

EXPLAIN SELECT `a`.`alarm_id`, `a`.`dept_id`, `a`.`device_code`, `a`.`type`, `a`.`attr_name`, `a`.`attr_value`, `a`.`content`, `a`.`create_time`
FROM `sys_alarm` `a`
WHERE `a`.`dept_id` = 214;

EXPLAIN SELECT `a`.`alarm_id`, `a`.`dept_id`, `a`.`device_code`, `a`.`type`, `a`.`attr_name`, `a`.`attr_value`, `a`.`content`, `a`.`create_time`
FROM `sys_alarm` `a`
WHERE `a`.`dept_id` = 214
AND `a`.`create_time` > '2021-11-06 15:00:17';
# id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE a ref search search 9 const 402 100.00

If you modify your query to add a condition for device_code you should see the key_len size change to 332, I'll come back to this later on.

EXPLAIN
SELECT `a`.`alarm_id`, `a`.`dept_id`, `a`.`device_code`, `a`.`type`, `a`.`attr_name`, `a`.`attr_value`, `a`.`content`, `a`.`create_time`
FROM `sys_alarm` `a`
WHERE `a`.`dept_id` = 214
AND `a`.`device_code` = 'ABCDE';
# id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE a ref search search 332 const const 1 100.00

And add attr_name -

EXPLAIN
SELECT `a`.`alarm_id`, `a`.`dept_id`, `a`.`device_code`, `a`.`type`, `a`.`attr_name`, `a`.`attr_value`, `a`.`content`, `a`.`create_time`
FROM `sys_alarm` `a`
WHERE `a`.`dept_id` = 214
AND `a`.`device_code` = 'ABCDE'
AND `a`.`attr_name` = 'dsgf';
# id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE a ref search search 535 const const const 1 100.00

And finally adding back create_time -

EXPLAIN
SELECT `a`.`alarm_id`, `a`.`dept_id`, `a`.`device_code`, `a`.`type`, `a`.`attr_name`, `a`.`attr_value`, `a`.`content`, `a`.`create_time`
FROM `sys_alarm` `a`
WHERE `a`.`dept_id` = 214
AND `a`.`device_code` = 'ABCDE'
AND `a`.`attr_name` = 'dsgf'
AND `a`.`create_time` > '2021-11-06 15:00:17';
# id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE a range search search 539 1 100.00 Using index condition

Now coming back to the key_len change from 9 to 332. When only using the dept_id it is just the BIGINT. As soon as you add device_code it jumps to 332. I am not 100% sure why 332 but it is something along the lines of - 80 (varchar length) * 4 (bytes per char due to multibyte charset) = 320.

The use of ROW_FORMAT = Compact would suggest that you are conscious of data size yet you are using a BIGINT (8 bytes) to store dept_id. Do you really need 9,223,372,036,854,775,807 departments?

CodePudding user response:

The order of clauses in WHERE does not matter. The order of columns in an INDEX does matter.

Both of your queries and the first two by nnichols would benefit from INDEX(dept_id, create_time) in that order.

  • dept_id is tested by =, so it should come first.
  • create_time is tested by a 'range', so it should be last.

I discuss more details in http://mysql.rjweb.org/doc.php/index_cookbook_mysql

nnichols goes on to have multiple = clauses, plus a > test. Tge columns tested with = should come first in a composite INDEX, in any order, then followed by create_time.

Key_len and cardinality do not matter in the choice of how to arrange the columns in an index.

I agree that BIGINT is rarely needed. BIGINT NOT NULL shows up in Explain as 8; BIGINT NULL shows up as 9. That is, the mysterious 1 is because of being NULLable. (It is not a big deal, just confusing.) Meanwhile, a DATETIME will show up as 9 or 10 bytes, depending on NOT NULL.

Another note: If you have both INDEX(dept_id) and INDEX(dept_id, create_time), the optimizer may pick the former even though the latter is clearly better. So drop the former index and keep the latter.

  • Related