I'm working on the optimization of MySQL query these days, one of the issues I've encountered is DATE()
maybe not working for the table partitioned by date range.
Here is the sample table:
CREATE TABLE `testing_db` (
`date_time` date NOT NULL,
`id` varchar(10) NOT NULL,
PRIMARY KEY (`date_time`,`id`) USING BTREE,
UNIQUE KEY `unique` (`date_time`,`id`),
KEY `idx_date_time` (`date_time`),
KEY `idx_id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
/*!50100 PARTITION BY RANGE (to_days(`date_time`))
(PARTITION p0 VALUES LESS THAN (TO_DAYS('2021-01-01')),
PARTITION p2021_01 VALUES LESS THAN (TO_DAYS('2021-01-31')),
PARTITION p2021_02 VALUES LESS THAN (TO_DAYS('2021-02-28')),
PARTITION future VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;
Statement with DATE()
:
EXPLAIN
SELECT date_time, id FROM testing_db WHERE date_time = '2021-02-25';
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE testing_db p2021_02 ref PRIMARY,unique,idx_date_time,idx_id PRIMARY 3 const 1 100.00 Using index
Statement without DATE()
:
EXPLAIN
SELECT date_time, id FROM testing_db WHERE DATE(date_time) = '2021-02-25';
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE testing_db p0,p2021_01,p2021_02,future index idx_date_time 3 1 100.00 Using where; Using index
Comparing two explanations, obviously the statement with DATE()
scans all partitions while the statement without DATE()
doesn't. Its impact may be significant in a large table.
I've researched similar issues, but it seems they are not relevant to this case:
- Official Doc,
DATE()
extracts the date part of the date or datetime. - Mysql, partitioning not working on date range
- https://bugs.mysql.com/bug.php?id=28928
Could you help figure it out? Thanks a lot!
CodePudding user response:
The use of the DATE() function in your WHERE clause negates the use of any relevant index which causes a full table scan. The full table scan will need to read from all partitions.
In your example you are applying the DATE() function to a column of type DATE, so it serves no purpose.
CodePudding user response:
INDEX(date_time)
is unnecessary because there are two other indexes starting with that column.A
PRIMARY KEY
is (in MySQL) aUNIQUE
key. So yourUNIQUE(datetime, id)
is redundant.Usually is is unwise to start any index with the partition key (
date_time
).WHERE DATE(date_time) = ...
is not "sargable". That is, no indexing ofdate_time
can be used when hiding a column in a function (DATE()
). (This is the main problem that you are asking about.)Instead of using
DATE()
, use a range, such as:WHERE date_time >= '2021-02-26' AND date_time < '2021-02-26' INTERVAL 1 DAY
Based on the above comments, plus other things, just these two indexes would be better:
PRIMARY KEY(id, date_time), INDEX(date_time, id)
Please don't call it
date_time
when it is only aDATE
. My comments work for either datatype. TheDATE()
function is never needed around a column of datatypeDATE
nor a string that looks like a date.Your partition definitions put the last day of each month in the 'wrong' partition'.
Be aware that
PARTITIONing
rarely helps with performance. I discuss that further in Pagination