I'm running a couple tests on MySQL Clustered vs Non Clustered indexes where I have a table 100gb_table
which contains ~60 million rows:
100gb_table schema:
CREATE TABLE 100gb_table (
id int PRIMARY KEY NOT NULL AUTO_INCREMENT,
c1 int,
c2 text,
c3 text,
c4 blob NOT NULL,
c5 text,
c6 text,
ts timestamp NOT NULL default(CURRENT_TIMESTAMP)
);
and I'm executing a query that only reads the clustered index:
SELECT id FROM 100gb_table ORDER BY id;
I'm seeing that it takes almost an ~55 min for this query to complete which is strangely slow. I modified the table by adding another index on top of the Primary Key column and ran the following query which forces the non-clustered index to be used:
SELECT id FROM 100gb_table USE INDEX (non_clustered_key) ORDER BY id;
This finished in <10 minutes, much faster than reading with the clustered index. Why is there such a large discrepancy between these two? My understanding is that both indexes store the index column's values in a tree structure, except the clustered index contains table data in the leaf nodes so I would expect both queries to be similarly performant. Could the BLOB column possibly be distorting the clustered index structure?
CodePudding user response:
The answer comes in how the data is laid out.
The PRIMARY KEY
is "clustered" with the data; that is, the data is order ed by the PK in a B Tree structure. To read all of the ids
, the entire BTree must be read.
Any secondary index is also in a B Tree structure, but it contains (1) the columns of the index, and (2) any other columns in the PK.
In your example (with lots of [presumably] bulky columns), the data BTree is a lot bigger than the secondary index (on just id
). Either test probably required reading all the relevant blocks from the disk.
A side note... This is not as bad as it could be. There is a limit of about 8KB on how big a row can be. TEXT
and BLOB
columns, when short enough, are included in that 8KB. But when one is bulky, it is put in another place, leaving behind a 'pointer' to the text/blob. Hence, the main part of the data BTree is smaller than it might be if all the text/blob data were included directly.
Since SELECT id FROM tbl
is a mostly unnecessary query, the design of InnoDB does not worry about the inefficiency you discovered.
Tack on ORDER BY
or WHERE
, etc, and there are many different optimizations that could into play. You might even find that INDEX(c1)
will let your query run in not much more than 10 minutes. (I think I have given you all the clues for 'why'.)
Also, if you had done SELECT * FROM tbl
, it might have taken much longer than 55 minutes. This is because of having extra [random] fetches to get the texts/blobs from the "off-record" storage. And from the network time to shovel far more data.