I have the following query. On MySQL 5.7 it takes over 30 mins minutes to run but on MySQL 8 it takes less than 10 seconds. There are around 20k records in students_t, 500 in classes_t and 1100 in groups_t. student_enrolled_classes_t contains 100288 records and 160156 records.There are no foreign keys in place.
SELECT DISTINCT
s.forename
, s.surname
, count(c.id) as 'classes'
, group_concat(DISTINCT g.name) 'groups'
FROM students_t s
, student_enrolled_classes_t c
, student_enrolled_groups_t eg
, groups_t g
WHERE s.id = c.student_id
AND s.id = eg.student_id
AND eg.group_id = g.id
GROUP BY s.id, c.id
ORDER BY s.forename ASC;
The table structures are (generated with Hibernate):
CREATE TABLE classes_t
(id BIGINT auto_increment PRIMARY KEY);
CREATE TABLE groups_t
(id BIGINT auto_increment PRIMARY KEY
,name VARCHAR(255) NULL);
CREATE TABLE student_enrolled_classes_t
(id BIGINT auto_increment PRIMARY KEY
,class_id BIGINT NULL
,student_id BIGINT NULL);
CREATE TABLE student_enrolled_groups_t
(id BIGINT NOT NULL PRIMARY KEY
,group_id BIGINT NULL
,student_id BIGINT NULL);
CREATE TABLE students_t
(id BIGINT auto_increment PRIMARY KEY
,forename VARCHAR(255) NULL
,surname VARCHAR(255) NULL);
The question is: Why does it take 30 mins on MySQL 5.7 but only 10 seconds on MySQL 8?
Thank you
CodePudding user response:
You could write the query as follows. It uses newer join syntax and takes into a account situations where student is not part of a class/group.
SELECT
s.forename
, s.surname
, count(distinct c.id) as 'classes'
, group_concat(distinct g.name) 'groups'
FROM students_t s
LEFT JOIN student_enrolled_classes_t c ON c.student_id=s.id
LEFT JOIN student_enrolled_groups_t eg ON eg.student_id=s.id
LEFT JOIN groups_t g ON eg.group_id = g.id
GROUP BY s.id
ORDER BY s.forename ASC;
CodePudding user response:
We need to see EXPLAIN SELECT ...
to analyze why one runs so slowly. While you are at it, also get EXPLAIN FORMAT=JSON SELECT ...
.
This reformulation and indexes may speed up the query on both servers, possibly making them both fast. Note that it gets rid of the GROUP BY
and DISTINCT
, both of which slow down the query.
SELECT s.forename, s.surname,
( SELECT COUNT(*)
FROM student_enrolled_classes_t
WHERE student_id = s.id
) AS 'classes',
( SELECT GROUP_CONCAT(g.name)
FROM student_enrolled_groups_t AS eg
JOIN groups_t AS g ON eg.group_id = g.id
WHERE eg.student_id = s.id
) AS 'groups'
FROM student_t AS s
ORDER BY s.forename ASC;
It seems that student_enrolled_groups_t
is a many-to-many mapping table. In that case, get rid of id
and have these two indexes:
PRIMARY KEY(student_id, group_id)
INDEX(group_id, student_id)
When writing JOINs
, please use the newer syntax involving ON
-- for saying how the tables are 'related'. Filtering is done in WHERE
.