I have a very large table (that is still growing) (around 90GB with ~350mil rows). It is a table that include sales of items, if you were wondering.
In this table, there's (for example), column A,B,C,D,E,F,G. Currently, I am using a combined index, consisting of (A,B,C,D,E,F).
Usually, the query will consist of A,B,C,D,E. F is included occasionally (hence the indexing). eg,
SELECT * FROM table WHERE A = ? AND B = ? AND C = ? AND D = ? AND E = ?;
Sometimes, with the addon of AND F = ?;
But on certain occasion, the query will consist of A,B,C,D,G (whereby G is not indexed (not combined nor single indexed).
This causes timeout on certain occasion as the data is quite big.
So my question is, in order to solve this issue in terms of indexing,
should I
Option 1: add G into the combined index, making it become (A,B,C,D,E,F,G).
- Does this even work when I query A,B,C,D,G (missing E & F)?
Option 2: add G as a single index.
- Based on what i know, this does not work, as my query has A,B,C,D,G. The first combined index will be used instead (correct me if I'm wrong).
Option 3: Go with option 1, combine all the columns, but I change my query instead, to always query A,B,C,D,E,F,G even when F is not needed.
eg,
SELECT * FROM table WHERE A = ? AND B = ? AND C = ? AND D = ? AND E = ? AND F IS NOT NULL AND G = ?;
Thanks
CodePudding user response:
Option 1 - Yes, this will work. The server will perform index seek by (A,B,C,D,E)
and furter index scan by (G)
.
Option 2 - Makes no sense in most cases, server uses only one index for one source table copy. But when the selectivity of single index by (G)
is higher than one for (A,B,C,D,E)
combination then the server will use this single-column index.
Option 3 - The processing is equal to one in Option 2.
CodePudding user response:
Are the PRIMARY KEY's
column(s) included in A..E ? If so, none of the indexes are needed.
What datatypes are involved?
Are they all really tests on =
? If not then 'all bets are off'. More specifically, useful indexes necessarily start with the columns tested with =
(in any order). In particular, F IS NOT NULL
is not =
(but IS NULL
would count as =
).
I would expect INDEX(A,B,C,D,E, anything else or nothing else)
to work for all of the queries you listed. (Hence, I suspect there are some details missing from your over-simplified description.)
How "selective" are F and G? For example, if most of the values of G
are distinct, then INDEX(G)
would possibly be useful by itself.
Please provide SHOW CREATE TABLE
and EXPLAIN SELECT ...