I have worked my way around many challenges with MySQL, and i think right now i am able to build everything that i need, to get something to work. But now, for a pretty huge SQL statement that returns a lot of data, i need to work on MySQL performance for the first time.
I was hoping someone here could help me find out why the following statement is so incredibly slow. It takes over 3 minutes to collect 740 results out of different tables. The biggest table beeing the "reports" table, consisting of somewhere over 20.000 entries at the moment.
I can also educate myself if someone could just point me in the right direction. I don't even know where to search for answers for my current problem.
Okay, so here is the statement that i am talking about. Maybe, if someone has enough experience with SQL performance, something just right away jumps at them. I would be happy for any kind of feedback. I'll elaborate on the statement right after the code itself:
SELECT
R_ID,
R_From,
R_To,
SUM(UR_TotalTime) AS UR_TotalTime,
R_Reported,
U_ID,
U_Lastname,
U_Firstname,
C_ID,
C_Lastname,
C_Firstname,
R_Breaks,
MAX(CR_BID) AS CR_BID,
R_Type,
R_Distance,
R_AdditionalDistance,
R_Activities,
R_Description,
R_Signature,
CT_SigReq,
MAX(I_LastIntegration) AS I_LastIntegration
FROM
reports
LEFT JOIN
userreports ON R_ID = UR_RID
LEFT JOIN
users ON R_UID = U_ID
LEFT JOIN
customers ON R_CID = C_ID
LEFT JOIN
customerterms ON CT_CID = R_CID
LEFT JOIN
integration ON R_UID = I_UID
LEFT JOIN
customerreports ON R_ID = CR_RID
WHERE
(CAST(R_From AS DATE) BETWEEN CT_From AND CT_To
OR R_CID = 0)
AND ((R_From BETWEEN '2021-02-01 00.00.00' AND '2021-02-28 23.59.59')
OR (R_To BETWEEN '2021-02-01 00.00.00' AND '2021-02-28 23.59.59')
OR (R_From <= '2021-02-01 00.00.00'
AND R_To >= '2021-02-28 23.59.59'))
GROUP BY R_ID
ORDER BY R_From ASC
So what i have here is the following: reports (R_*) - This is the main table that is queried. I need some of it's data, but it's also the filter, since i only need results between specific timestamps.
CREATE TABLE `reports` (
`R_ID` int(100) NOT NULL AUTO_INCREMENT,
`R_Type` varchar(1) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`R_UID` int(6) NOT NULL,
`R_CID` int(10) NOT NULL,
`R_From` datetime(0) NOT NULL,
`R_To` datetime(0) NOT NULL,
`R_Traveltime` int(11) NOT NULL,
`R_Breaks` int(11) NOT NULL DEFAULT 0,
`R_PayoutFlextime` decimal(20, 2) NOT NULL DEFAULT 0.00,
`R_Distance` int(11) NOT NULL DEFAULT 0,
`R_AdditionalDistance` int(11) NOT NULL DEFAULT 0,
`R_Activities` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`R_Description` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`R_Signature` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT '0',
`R_SignatureDate` datetime(0) DEFAULT NULL,
`R_Reported` datetime(0) NOT NULL,
`R_Status` varchar(10) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT 'New',
`R_EditedBy` int(11) DEFAULT NULL,
`R_EditedDateTime` datetime(0) DEFAULT NULL,
PRIMARY KEY (`R_ID`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
userreports (UR_*) - Delivers some data that is calculated from the sourcedata in reports
CREATE TABLE `userreports` (
`UR_ID` int(11) NOT NULL AUTO_INCREMENT,
`UR_RID` int(100) NOT NULL,
`UR_UID` int(6) NOT NULL,
`UR_Date` date NOT NULL,
`UR_From` time(0) NOT NULL,
`UR_To` time(0) NOT NULL,
`UR_ReportedTime` decimal(20, 5) DEFAULT NULL,
`UR_ReportedTravel` decimal(20, 5) NOT NULL,
`UR_ReportedBreaks` decimal(20, 5) DEFAULT NULL,
`UR_TotalPercentageSurcharge` decimal(20, 2) DEFAULT NULL,
`UR_TotalTime` decimal(20, 5) DEFAULT NULL,
`UR_PercentageSurchargeTypes` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`UR_Distance` decimal(20, 2) DEFAULT NULL,
`UR_AdditionalDistance` decimal(20, 2) DEFAULT NULL,
`UR_TravelCompensation` decimal(20, 2) DEFAULT NULL,
PRIMARY KEY (`UR_ID`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
customerreports (CR_*) - Same as userreports, but with calculated data from the customers perspective
CREATE TABLE `customerreports` (
`CR_ID` int(11) NOT NULL AUTO_INCREMENT,
`CR_RID` int(100) NOT NULL,
`CR_CID` int(6) NOT NULL,
`CR_Date` date NOT NULL,
`CR_From` time(0) NOT NULL,
`CR_To` time(0) NOT NULL,
`CR_ReportedTime` decimal(20, 2) DEFAULT NULL,
`CR_ReportedBreaks` decimal(20, 2) DEFAULT NULL,
`CR_Hourly` decimal(20, 2) DEFAULT NULL,
`CR_Salary` decimal(20, 2) DEFAULT NULL,
`CR_TotalPercentageSurcharge` decimal(20, 2) DEFAULT NULL,
`CR_TotalFixedSurcharge` decimal(20, 2) DEFAULT NULL,
`CR_TotalTime` decimal(20, 2) DEFAULT NULL,
`CR_TotalSalary` decimal(20, 2) DEFAULT NULL,
`CR_FixedSurchargeTypes` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`CR_PercentageSurchargeTypes` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`CR_Distance` decimal(20, 2) DEFAULT NULL,
`CR_AdditionalDistance` decimal(20, 2) DEFAULT NULL,
`CR_TravelCompensation` decimal(20, 2) DEFAULT NULL,
`CR_BID` int(11) NOT NULL DEFAULT 0,
PRIMARY KEY (`CR_ID`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
users (U_*) - Obviously delivers Data to the user that created the report, e.g. name,...
CREATE TABLE `users` (
`U_ID` int(6) NOT NULL AUTO_INCREMENT,
`U_PW` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_PWInitial` tinyint(1) NOT NULL,
`U_FailedAttempts` int(1) NOT NULL,
`U_Email` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Title` text CHARACTER SET utf8 COLLATE utf8_general_ci,
`U_Firstname` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Lastname` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_ETC` varchar(60) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Street` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Housenumber` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Code` varchar(5) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_City` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Phone` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Mobile` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Birthdate` date NOT NULL,
`U_Sex` varchar(1) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Maritalstatus` varchar(2) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Severelydisabled` tinyint(1) NOT NULL,
`U_Severelydisabledspecify` int(3) NOT NULL,
`U_Citizenship` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Education` varchar(70) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Vocationaltraining` varchar(70) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_CLID` tinyint(1) NOT NULL,
`U_CLSpecify` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_IBAN` varchar(27) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_BIC` varchar(11) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_INID` int(11) DEFAULT NULL,
`U_Insurancenumber` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Insurancetype` varchar(8) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Taxidentificationnumber` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Confession` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_Entry` date NOT NULL,
`U_TEntry` date NOT NULL,
`U_Exit` date NOT NULL DEFAULT '9999-12-31',
`U_Hourscarryover` decimal(20, 2) NOT NULL,
`U_TotalHolidayCarryover` int(11) NOT NULL DEFAULT 0,
`U_UsedHolidayCarryover` int(11) NOT NULL DEFAULT 0,
`U_SIN` varchar(30) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`U_RVBDone` tinyint(1) NOT NULL DEFAULT 0,
`U_ClosedMonth` date NOT NULL DEFAULT '1970-01-01',
`U_DeleteDate` date DEFAULT NULL,
PRIMARY KEY (`U_ID`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 2 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
customers (C_*) - Same as users, but for the data of the customer that the user worked on
CREATE TABLE `customers` (
`C_ID` int(10) NOT NULL AUTO_INCREMENT,
`C_MID` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_Active` tinyint(1) NOT NULL,
`C_Email` text CHARACTER SET utf8 COLLATE utf8_general_ci,
`C_Title` text CHARACTER SET utf8 COLLATE utf8_general_ci,
`C_Firstname` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_Lastname` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_Birthdate` date NOT NULL,
`C_ETC` varchar(60) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_Street` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_Housenumber` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_Code` varchar(5) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_City` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_Phone` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`C_Mobile` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`C_IBAN` varchar(27) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_BIC` varchar(11) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`C_Insurancenumber` varchar(30) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`C_INID` int(11) DEFAULT NULL,
`C_Insurancetype` varchar(8) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`C_Sex` varchar(1) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`C_Contact1` text CHARACTER SET utf8 COLLATE utf8_general_ci,
`C_Contact2` text CHARACTER SET utf8 COLLATE utf8_general_ci,
`C_ContactChoice` int(1) DEFAULT 0,
`C_DeleteDate` date DEFAULT NULL,
`C_DeactivationDate` date DEFAULT NULL,
`C_CreationDate` date DEFAULT NULL,
`C_DeceasedDate` date DEFAULT NULL,
PRIMARY KEY (`C_ID`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
integration (I_*) - Provides data on whether or not the report is already integrated (and can no longer be changed)
CREATE TABLE `integration` (
`I_ID` int(11) NOT NULL AUTO_INCREMENT,
`I_UID` int(11) NOT NULL,
`I_LastIntegration` date NOT NULL DEFAULT '1970-01-01',
`I_SumFlextime` decimal(20, 2) NOT NULL DEFAULT 0.00,
`I_OldHolidays` int(5) NOT NULL DEFAULT 0,
PRIMARY KEY (`I_ID`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
customerterms (CT_*) - In this case only provides if the specified customer needs to sign the report
CREATE TABLE `customerterms` (
`CT_ID` int(50) NOT NULL AUTO_INCREMENT,
`CT_CID` int(10) NOT NULL,
`CT_From` date NOT NULL,
`CT_To` date NOT NULL,
`CT_Hourly` decimal(20, 2) NOT NULL,
`CT_FixedTravelCompensation` decimal(20, 2) NOT NULL,
`CT_PerKMCompensationBase` decimal(20, 2) NOT NULL,
`CT_PerKMCompensationAdditional` decimal(20, 2) NOT NULL,
`CT_MaxTravelCompensationReport` decimal(20, 2) DEFAULT NULL,
`CT_MaxTravelCompensationMonthly` decimal(20, 2) DEFAULT NULL,
`CT_FixedSaturdaySurcharge` decimal(20, 2) NOT NULL DEFAULT 0.00,
`CT_PercentageSaturdaySurcharge` decimal(20, 2) NOT NULL DEFAULT 1.00,
`CT_FixedSundaySurcharge` decimal(20, 2) NOT NULL DEFAULT 0.00,
`CT_PercentageSundaySurcharge` decimal(20, 2) NOT NULL DEFAULT 1.00,
`CT_FixedHolidaySurcharge` decimal(20, 2) NOT NULL DEFAULT 0.00,
`CT_PercentageHolidaySurcharge` decimal(20, 2) NOT NULL DEFAULT 1.00,
`CT_SigReq` int(1) NOT NULL,
`CT_NighttimeFrom` time(0) NOT NULL DEFAULT '00:00:00',
`CT_NighttimeTo` time(0) NOT NULL DEFAULT '00:00:00',
`CT_FixedNighttimeSurcharge` decimal(20, 2) NOT NULL DEFAULT 0.00,
`CT_PercentageNighttimeSurcharge` decimal(20, 2) NOT NULL DEFAULT 1.00,
`CT_StackingSurcharge` tinyint(1) NOT NULL DEFAULT 0,
`CT_MinimumTime` int(11) NOT NULL DEFAULT 1,
`CT_TimeIncrement` int(11) NOT NULL DEFAULT 1,
PRIMARY KEY (`CT_ID`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
The server is running MySQL 5.7, has 4 processors at 4,6Ghz, and 16GB of RAM available.
Since this is a hobby project, that i am supporting small care-businesses with, to allow them easier management of their daily tasks, i can change everything here. Code, Database Layout, you name it. As long as the poor people in the office don't have to wait for 5 minutes, just to sometimes even only get a timeout...
I'll add the result of EXPLAIN as image, since i can't get it to look good otherwise...
───── ────────────── ────────────────── ───────────── ───────── ────────────────────── ────────── ────────── ────────────────────────── ─────── ─────────── ─────────────────────────────────────────────────────
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
───── ────────────── ────────────────── ───────────── ───────── ────────────────────── ────────── ────────── ────────────────────────── ─────── ─────────── ─────────────────────────────────────────────────────
| 1 | SIMPLE | reports | NULL | ALL | PRIMARY,R_From,R_To | NULL | NULL | NULL | 22249 | 29.76 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | userreports | NULL | ALL | NULL | NULL | NULL | NULL | 21359 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | users | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dbs671769.reports.R_UID | 1 | 100.00 | NULL |
| 1 | SIMPLE | customers | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dbs671769.reports.R_CID | 1 | 100.00 | NULL |
| 1 | SIMPLE | customerterms | NULL | ALL | NULL | NULL | NULL | NULL | 1429 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | integration | NULL | ALL | NULL | NULL | NULL | NULL | 1134 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | customerreports | NULL | ALL | NULL | NULL | NULL | NULL | 9078 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
───── ────────────── ────────────────── ───────────── ───────── ────────────────────── ────────── ────────── ────────────────────────── ─────── ─────────── ─────────────────────────────────────────────────────
Is there any way to consolidate all this data faster, but as reliable?
Thanks a lot in advance, for any help or idea on this.
CodePudding user response:
Let's start by adding an index for each of the foreign keys used in your query -
ALTER TABLE `userreports`
ADD INDEX `FK_UR_RID` (`UR_RID`);
ALTER TABLE `customerterms`
ADD INDEX `FK_CT_CID` (`CT_CID`);
ALTER TABLE `integration`
ADD INDEX `FK_I_UID` (`I_UID`);
ALTER TABLE `customerreports`
ADD INDEX `FK_CR_RID` (`CR_RID`);
Please add these indices and then add the updated EXPLAIN output plus the result of the following query to your question.
-- this just retrieves some basic stats about size of each table used in your query
SELECT TABLE_NAME, ENGINE, VERSION, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH, INDEX_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'dbs671769'
AND TABLE_NAME IN('customerreports', 'customers', 'customerterms', 'integration', 'reports', 'userreports', 'users');
CodePudding user response:
WHERE (CAST(R_From AS DATE) BETWEEN CT_From AND CT_To
OR R_CID = 0
)
OR
is sluggish. Is there some way to get rid of R_CID = 0
? If not, we can talk about UNION
.
Rewrite the rest thus:
R_From >= CT_From AND R_From < CT_To INTERVAL 1 DAY
AND ((R_From BETWEEN '2021-02-01 00.00.00'
AND '2021-02-28 23.59.59')
OR (R_To BETWEEN '2021-02-01 00.00.00'
AND '2021-02-28 23.59.59')
OR (R_From <= '2021-02-01 00.00.00'
AND R_To >= '2021-02-28 23.59.59')
)
Is R_From
guaranteed to be < R_To
? If so, would this simplification (OR-removal) do the same
AND R_From < '2021-03-01'
AND R_To >= '2021-02-01'
This requires two passes over the intermediate results:
GROUP BY R_ID
ORDER BY R_From ASC
This requires one pass, while usually giving the same results, maybe even better results:
GROUP BY R_From, R_ID
ORDER BY R_From, R_ID
(Pet peeve: Don't prefix column names with the table name (or 'R_'); Do use aliases for all columns when JOINing: SELECT R.ID ... FROM reports AS R JOIN ...
)
Another Answer has mentioned some INDEXes
; that may give you a lot of speed-up. After some of my suggestions, there may be more index tips.
TEXT
columns have some overhead; many of the cases you list could be done with something smaller, like VARCHAR(100)
. For example, currently, the longest "city" name in the world has only 91 chars: "Poselok Uchebnogo Khozyaystva Srednego Professionalno-Tekhnicheskoye Uchilishche Nomer Odin"
You seem to be running an old version of MySQL? Else you might have been dinged for the GROUP BY
; cf "only_full_group_by".