I have the following MySQL table:
CREATE TABLE example (
id INT AUTO_INCREMENT NOT NULL,
node0 SMALLINT DEFAULT 0 NOT NULL,
node1 SMALLINT DEFAULT 0 NOT NULL,
node2 SMALLINT DEFAULT 0 NOT NULL,
node3 SMALLINT DEFAULT 0 NOT NULL,
version SMALLINT DEFAULT 1 NOT NULL,
user_id INT DEFAULT NULL,
score INT DEFAULT 0 NOT NULL,
datetime BIGINT NOT NULL,
INDEX IDX_22B017FAA76ED395 (user_id), PRIMARY KEY(id)
);
ALTER TABLE example ADD CONSTRAINT FK_22B017FAA76ED395 FOREIGN KEY (user_id) REFERENCES user (id);
The table will be fairly large (a few million rows), and I am wondering what the optimal way of indexing the columns for performance is.
The table will mainly be queried to get all the rows matching:
- a node0 value
- a node1 value
- a node2 value
- a node3 value
- a version value
... with a JOIN to fetch the associated users (user_id field)
These fields (node0, node1, node2, node3, version) can form a key (no two rows can have the same values for all of them), but I would like to keep the ID as an auto-incrementing key so I can order the rows.
Should I add a secondary key for these fields? Or make them the primary key and have the ID as a secondary key? And/or add individual indices?
Answer:
You probably want multicolumn indexes matching your most common query patterns. From your description, your main query pattern is this.
WHERE node0 = #constant#
AND node1 = #constant#
AND node2 = #constant#
AND node3 = #constant#
AND version = #constant#
This multicolumn index will help.
INDEX example_nodes_version (node0, node1, node2, node3, version)
It will only help queries that filter on a leftmost prefix of the indexed columns (the order in which the conditions appear in the WHERE clause does not matter). For example it will help this
WHERE node0 = #constant#
AND node1 = #constant#
but not this
WHERE node2 = #constant#
AND version = #constant#
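On an existing table, the index above could be added and checked along these lines. This is only a sketch: the literal values are placeholders, the selected columns are an arbitrary choice, and UNIQUE is an assumption based on the question's statement that no two rows share all five values (use a plain INDEX if that does not hold).

```sql
-- Add the multicolumn index to the existing table.
-- UNIQUE is an assumption taken from the question; drop the keyword if duplicates can occur.
ALTER TABLE example
  ADD UNIQUE INDEX example_nodes_version (node0, node1, node2, node3, version);

-- Full match on all five columns, joined to the user table via the existing foreign key.
-- LEFT JOIN because user_id is nullable. The literal values are placeholders.
EXPLAIN
SELECT e.id, e.score, e.`datetime`, u.id AS user_id
FROM example AS e
LEFT JOIN `user` AS u ON u.id = e.user_id
WHERE e.node0 = 1
  AND e.node1 = 2
  AND e.node2 = 3
  AND e.node3 = 4
  AND e.version = 1;

-- Leftmost prefix only (node0, node1): the index can still be used.
EXPLAIN SELECT * FROM example WHERE node0 = 1 AND node1 = 2;

-- Skips node0 and node1: the index cannot be used to seek to the matching rows.
EXPLAIN SELECT * FROM example WHERE node2 = 3 AND version = 1;
```

In the EXPLAIN output, the key column shows which index the optimizer picked, so it should list example_nodes_version for the first two queries.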
Here's an even faster way to handle indexing if the vast majority of your queries follow the first query pattern.
You can make your primary key the concatenation of those columns, and create a UNIQUE key to handle your autoincrementing ID.
CREATE TABLE example (
id INT AUTO_INCREMENT NOT NULL,
node0 SMALLINT DEFAULT 0 NOT NULL,
node1 SMALLINT DEFAULT 0 NOT NULL,
node2 SMALLINT DEFAULT 0 NOT NULL,
node3 SMALLINT DEFAULT 0 NOT NULL,
version SMALLINT DEFAULT 1 NOT NULL,
user_id INT DEFAULT NULL,
score INT DEFAULT 0 NOT NULL,
datetime BIGINT NOT NULL,
INDEX IDX_22B017FAA76ED395 (user_id),
PRIMARY KEY(node0, node1, node2, node3, version),
UNIQUE INDEX ex_autoincrement (id)
);
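Why is this helpful? InnoDB uses a clustered index storage scheme. That is, it stores each row's data inside the primary key index itself. So, lookups according to the filtering above reach your data directly, rather than having to search a secondary index and then look the row up again in the clustered index by its primary key. The trade-off is that every secondary index also stores the primary key columns, so a wider primary key makes those indexes larger.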
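If the table already exists with id as its primary key, the switch might be done with a single ALTER TABLE, roughly as sketched below. It assumes the existing rows contain no duplicate (node0, node1, node2, node3, version) combinations, and changing the primary key rebuilds the table, which can take a while on a few million rows.

```sql
-- Swap the clustered index in one statement so the AUTO_INCREMENT column
-- is never left without a key (MySQL requires that column to be indexed).
-- Assumes no duplicate (node0, node1, node2, node3, version) rows exist.
ALTER TABLE example
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (node0, node1, node2, node3, version),
  ADD UNIQUE INDEX ex_autoincrement (id);
```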
Unless you expect this table to be large (a million rows or more) and heavily queried, you should keep it simple: create indexes to match your query patterns and use the id as the primary key.
And know this: as applications and their tables grow, new indexes are often required to match real-world query patterns. It's easy to add new indexes based on experience.
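As an illustration of adding an index later, suppose queries that filter only on node2 and version turn out to be common. That pattern and the index name below are hypothetical, chosen only to show the shape of the statements.

```sql
-- List the indexes currently defined on the table.
SHOW INDEX FROM example;

-- Hypothetical later addition for queries that filter on node2 and version only.
ALTER TABLE example
  ADD INDEX example_node2_version (node2, version);
```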