About MySQL's Leftmost Prefix Matching Optimization-CodePudding

I now have a table like this：

> DESC userInfo;
 -------- --------------------- ------ ----- --------- ---------------- 
| Field  | Type                | Null | Key | Default | Extra          |
 -------- --------------------- ------ ----- --------- ---------------- 
| id     | int(11)             | NO   | PRI | NULL    | auto_increment |
| name   | char(32)            | NO   | MUL | NULL    |                |
| age    | tinyint(3) unsigned | NO   |     | NULL    |                |
| gender | tinyint(1)          | NO   |     | 1       |                |
 -------- --------------------- ------ ----- --------- ----------------

I made (name, age) a joint unique index:

> SHOW INDEX FROM userInfo;
 ---------- ------------ -------------------- -------------- ------------- ----------- ------------- ---------- -------- ------ ------------ --------- -------------------- 
| Table    | Non_unique | Key_name           | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment      |
 ---------- ------------ -------------------- -------------- ------------- ----------- ------------- ---------- -------- ------ ------------ --------- -------------------- 
| userInfo |          0 | PRIMARY            |            1 | id          | A         |           0 |     NULL | NULL   |      | BTREE      |         |                    |
| userInfo |          0 | joint_unique_index |            1 | name        | A         |           0 |     NULL | NULL   |      | BTREE      |         | 联合唯一索引       |
| userInfo |          0 | joint_unique_index |            2 | age         | A         |           0 |     NULL | NULL   |      | BTREE      |         | 联合唯一索引       |
 ---------- ------------ -------------------- -------------- ------------- ----------- ------------- ---------- -------- ------ ------------ --------- -------------------- 
3 rows in set (0.00 sec)

Now, when I use the following query statement, its type is All:

> DESC SELECT * FROM userInfo WHERE age = 18;
 ---- ------------- ---------- ------------ ------ --------------- ------ --------- ------ ------ ---------- ------------- 
| id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
 ---- ------------- ---------- ------------ ------ --------------- ------ --------- ------ ------ ---------- ------------- 
|  1 | SIMPLE      | userInfo | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where |
 ---- ------------- ---------- ------------ ------ --------------- ------ --------- ------ ------ ---------- -------------

I can understand this behavior, because according to the leftmost prefix matching feature, age will not be used as an index column when querying.

But when I use the following statement to query, its type is Index:

> DESC SELECT name, age FROM userInfo WHERE age = 18;
 ---- ------------- ---------- ------------ ------- --------------- -------------------- --------- ------ ------ ---------- -------------------------- 
| id | select_type | table    | partitions | type  | possible_keys | key                | key_len | ref  | rows | filtered | Extra                    |
 ---- ------------- ---------- ------------ ------- --------------- -------------------- --------- ------ ------ ---------- -------------------------- 
|  1 | SIMPLE      | userInfo | NULL       | index | NULL          | joint_unique_index | 132     | NULL |    1 |   100.00 | Using where; Using index |
 ---- ------------- ---------- ------------ ------- --------------- -------------------- --------- ------ ------ ---------- -------------------------- 
1 row in set, 1 warning (0.00 sec)

I can't understand how this result is produced. According to Example 1, the age as the query condition does not satisfy the leftmost prefix matching feature, but from the results, its type is actually Index! Is this an optimization in MySQL?

When I try to make sure I use indexed columns as query conditions, their type is always ref, as shown below:

> DESC SELECT * FROM userInfo WHERE name = "Jack";
 ---- ------------- ---------- ------------ ------ -------------------- -------------------- --------- ------- ------ ---------- ------- 
| id | select_type | table    | partitions | type | possible_keys      | key                | key_len | ref   | rows | filtered | Extra |
 ---- ------------- ---------- ------------ ------ -------------------- -------------------- --------- ------- ------ ---------- ------- 
|  1 | SIMPLE      | userInfo | NULL       | ref  | joint_unique_index | joint_unique_index | 128     | const |    1 |   100.00 | NULL  |
 ---- ------------- ---------- ------------ ------ -------------------- -------------------- --------- ------- ------ ---------- ------- 
1 row in set, 1 warning (0.00 sec)

> DESC SELECT name, age FROM userInfo WHERE name = "Jack";
 ---- ------------- ---------- ------------ ------ -------------------- -------------------- --------- ------- ------ ---------- ------------- 
| id | select_type | table    | partitions | type | possible_keys      | key                | key_len | ref   | rows | filtered | Extra       |
 ---- ------------- ---------- ------------ ------ -------------------- -------------------- --------- ------- ------ ---------- ------------- 
|  1 | SIMPLE      | userInfo | NULL       | ref  | joint_unique_index | joint_unique_index | 128     | const |    1 |   100.00 | Using index |
 ---- ------------- ---------- ------------ ------ -------------------- -------------------- --------- ------- ------ ---------- ------------- 
1 row in set, 1 warning (0.00 sec)

Please tell me why when I use age as a query, the first result is ALL, but the second result is INDEX. Is this the result of MySQL optimization? In other words, when SELECT * is used, index column queries are not applied, but when SELECT joint_col1, joint_col2 FROM joint_col2 are used, index column queries (because type is INDEX) are used. Why does this difference occur?

CodePudding user response：

If you do

SELECT * FROM userInfo WHERE age = 18;

MySQL has to read, as you expected, every row of the table. It cannot use the index on (name, age) to limit the rows read.

If you do

SELECT name, age FROM userInfo WHERE age = 18;

the situation doesn't change a lot: MySQL will also have to read every row, and still cannot use the index on (name, age) to limit the rows read.

But it can use a trick: since you only need the columns name and age, it can read all rows (entries of the index) from the index and still have all information it needs, as the index obviously stores name and age (and, for InnoDB, it does include id for technical reasons, so you'd get the same behaviour if you try SELECT ìd, name, age FROM userInfo WHERE age = 18).

Why would MySQL do that? Because it has to read less absolute data than reading the complete table: the index stores the information you want in less bytes (as it doesn't include gender). Reading less data to get all the information you need is better than reading more data to get the same information. So MySQL will do just that.

CodePudding user response：

The "leftmost" rule applies to the WHERE clause versus the INDEX.

INDEX(name, age) is useful for WHERE name = '...' or WHERE name = '...' AND ((anything else)) because name is leftmost in the index.

What you have is WHERE age = ... ((and nothing else)), so you need INDEX(age) (or INDEX(age, ...)).

In particular, SELECT name, age FROM userInfo WHERE age = 18;:

INDEX(age) -- good
INDEX(age, name) -- better because it is "covering".

The order of columns in the WHERE does not matter; the order in the INDEX does matter.