Slow MySQL query after inner joining 3 tables and adding indexes


I am finding my MySQL query very slow — about 1 second for this query alone, according to my testing. I have tried adding indexes, but that doesn't seem to help. I have attached my EXPLAIN output, tables, and indexes. I'm limited to MySQL 5.6 right now. I'm not sure how else to make it faster. Any ideas?

SELECT symbol.name, symbol.code, tbl_company_data.id as dataid, symbol.id as companyid , tbl_company_data.company_size as size 
FROM symbol
INNER JOIN tbl_company_data ON symbol.id=tbl_company_data.company_symbol_id 
INNER JOIN tbl_eta metatbl ON metatbl.symbol_id=symbol.id 
INNER JOIN tbl_eta metatbl_0 ON metatbl_0.symbol_id=symbol.id 
INNER JOIN tbl_eta metatbl_1 ON metatbl_1.symbol_id=symbol.id
WHERE 1 AND ((metatbl_0.meta_key= 'word2' AND metatbl_0.meta_value!=''
     AND CAST(metatbl_0.meta_value AS DECIMAL) BETWEEN CAST(0 AS DECIMAL) AND CAST(1000 AS DECIMAL)) 
    AND (metatbl_1.meta_key= 'word1' AND metatbl_1.meta_value!='' 
    AND CAST(metatbl_1.meta_value AS DECIMAL) BETWEEN CAST(-0.0001 AS DECIMAL) AND CAST(2 AS DECIMAL))) 
GROUP BY companyid 
HAVING 1 
ORDER BY symbol.code DESC 
LIMIT 0, 1500

SHOW CREATE TABLE

CREATE TABLE `symbol` (
 `id` bigint(20) NOT NULL AUTO_INCREMENT,
 `code` varchar(150) NOT NULL,
 `name` varchar(255) NOT NULL,
 `country` varchar(100) NOT NULL,
 `created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=56287 DEFAULT CHARSET=latin1


CREATE TABLE `tbl_company_data` (
 `id` bigint(20) NOT NULL AUTO_INCREMENT,
 `company_symbol_id` bigint(20) NOT NULL,
 `company_size` double(30,2) NOT NULL,
 PRIMARY KEY (`id`),
 UNIQUE KEY `company_symbol_id_2` (`company_symbol_id`),
 KEY `company_symbol_id` (`company_symbol_id`)
) ENGINE=MyISAM AUTO_INCREMENT=18361 DEFAULT CHARSET=latin1

CREATE TABLE `tbl_eta` (
 `id` bigint(20) NOT NULL AUTO_INCREMENT,
 `symbol_id` bigint(20) NOT NULL,
 `meta_key` text NOT NULL,
 `meta_value` longtext NOT NULL,
 `created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `updated_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`),
 KEY `symbol_id` (`symbol_id`),
 KEY `TEST` (`symbol_id`,`meta_key`(255))
) ENGINE=MyISAM AUTO_INCREMENT=470350 DEFAULT CHARSET=latin1

Sample data: tbl_eta

| id | symbol_id | meta_key | meta_value |
|----|-----------|----------|------------|
| 1  | 1 | word1 | 1  |
| 2  | 1 | word2 | 0  |
| 3  | 2 | word1 | 0  |
| 4  | 2 | word2 | 0  |
| 5  | 3 | word1 | 1  |
| 6  | 3 | word2 | 25 |
| 7  | 4 | word1 | 10 |
| 8  | 4 | word2 | 8  |
| 9  | 5 | word1 | 3  |
| 10 | 5 | word2 | 22 |

symbol

| id | code | name      | country |
|----|------|-----------|---------|
| 1  | AAA  | company1  | USA |
| 2  | AAB  | company2  | USA |
| 3  | AAC  | company3  | USA |
| 4  | AAD  | company4  | USA |
| 5  | AAE  | company5  | USA |
| 6  | AAF  | company6  | USA |
| 7  | AAG  | company7  | USA |
| 8  | AAH  | company8  | USA |
| 9  | AAI  | company9  | USA |
| 10 | AAJ  | company10 | USA |

tbl_company_data

| id | company_symbol_id | company_size |
|----|-------------------|--------------|
| 1  | 1 | 100000   |
| 2  | 2 | 200000   |
| 3  | 3 | 50000000 |

CodePudding user response:

When you have to CAST() a column value for use in a WHERE clause, the value being compared is no longer the value stored in any index you might have on that column, so the index is worthless for this query. Worse, you can't know whether a given row will match the clause until after the cast, meaning the conversion has to run on every row in the table, even for values that will ultimately fail your test.
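
As a minimal sketch of the difference, assume a hypothetical table prices with an indexed DECIMAL column amount and an indexed text column amount_text (these names are illustrative only, not from your schema):

-- Sargable: the bare column is compared, so an index on `amount`
-- can be range-scanned directly
SELECT id FROM prices WHERE amount BETWEEN 0 AND 1000;

-- Non-sargable: CAST() hides the column from its index, so every row
-- has to be read and converted before the comparison can run
SELECT id FROM prices WHERE CAST(amount_text AS DECIMAL) BETWEEN 0 AND 1000;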

This is one of the reasons (among several) to avoid the Entity-Attribute-Value (EAV) schema pattern, as seen here in the tbl_eta table: you can't really index the values properly when you need to account for values of different types.

By contrast, if the values in your database were already decimals, the engine could use the index to jump straight to where the matching values start at 0 for the BETWEEN comparison, walk the index until it reaches the end value of 1000, and stop there. That could make your query run in a small fraction of the time. But again, this depends on having a schema with properly typed columns.
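
If restructuring is an option, here is a sketch of one possible direction (the column and index names below are just suggestions, and it assumes meta_value holds clean numeric text for the rows you filter on): make meta_key fully indexable and store the numeric values in a typed column, so both BETWEEN filters become index range scans.

-- Make meta_key indexable in full and add a typed numeric column
ALTER TABLE tbl_eta
  MODIFY meta_key VARCHAR(255) NOT NULL,
  ADD COLUMN meta_value_num DECIMAL(20,4) NULL,
  ADD KEY idx_key_num (meta_key, meta_value_num);

-- Backfill the typed column from the existing text values
UPDATE tbl_eta
SET    meta_value_num = CAST(meta_value AS DECIMAL(20,4))
WHERE  meta_value != '';

-- The filter is now sargable: equality on meta_key,
-- then an index range scan on the numeric value
SELECT symbol_id
FROM   tbl_eta
WHERE  meta_key = 'word2'
  AND  meta_value_num BETWEEN 0 AND 1000;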

CodePudding user response:

A quick way to optimize this query is to apply the filtering conditions (currently in the WHERE clause) before the joins, by pushing them into derived tables, like this:

SELECT 
    symbol.name, 
    symbol.code, 
    tbl_company_data.id as dataid, 
    symbol.id as companyid , 
    tbl_company_data.company_size as size 
FROM
    symbol 
INNER JOIN 
    tbl_company_data ON symbol.id=tbl_company_data.company_symbol_id 
INNER JOIN 
    tbl_eta metatbl ON metatbl.symbol_id=symbol.id 
INNER JOIN 
    (
    SELECT *
    FROM   tbl_eta 
    WHERE  meta_key    = 'word2' 
      AND  meta_value != '' 
      AND  CAST(meta_value AS DECIMAL) BETWEEN CAST(0 AS DECIMAL) AND CAST(1000 AS DECIMAL)
    ) metatbl_0 ON metatbl_0.symbol_id=symbol.id 
INNER JOIN 
    (
    SELECT *
    FROM   tbl_eta 
    WHERE  meta_key    = 'word1' 
      AND  meta_value != '' 
      AND  CAST(meta_value AS DECIMAL) BETWEEN CAST(-0.0001 AS DECIMAL) AND CAST(2 AS DECIMAL)
    ) metatbl_1 ON metatbl_1.symbol_id=symbol.id 
GROUP BY 
    companyid 
HAVING 
    1 
ORDER BY 
    symbol.code DESC 
LIMIT 
    0, 1500

Note: I'm assuming your original query is functionally correct and that you know what you're doing. What you see here is just an optimized form of the query you've written.
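
One way to sanity-check the rewrite (this is an expectation about the plan, not a guarantee): run it under EXPLAIN and look at the derived-table rows. MySQL 5.6 can add an auto-generated index (shown as <auto_key0>) on symbol_id to a materialized derived table, which keeps the joins cheap, although building each derived table still scans tbl_eta once.

EXPLAIN
SELECT symbol.id
FROM   symbol
INNER JOIN
    (
    SELECT symbol_id
    FROM   tbl_eta
    WHERE  meta_key    = 'word2'
      AND  meta_value != ''
      AND  CAST(meta_value AS DECIMAL) BETWEEN CAST(0 AS DECIMAL) AND CAST(1000 AS DECIMAL)
    ) metatbl_0 ON metatbl_0.symbol_id = symbol.id;

-- Expect a <derived2> row in the plan; if its key column shows
-- <auto_key0>, the join into the materialized result is indexed.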
