Home > Software engineering >  MySQL Character Set & Select Query Performance in store procedure
MySQL Character Set & Select Query Performance in store procedure

Time:04-07

Recently i noticed few queries are taking very long time in execution, checked further and found that MySQL Optimizer is trying to use COLLATE in Where clause and that's causing performance issue, if i run below query without COLLATE then getting quick response from database.

 SELECT notification_id FROM notification  
    WHERE ref_table = 2 
    AND ref_id =  NAME_CONST('v_wall_detail_id',_utf8mb4'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE 'utf8mb4_unicode_ci')

MySQL version 5.7
Database Character Set: utf8mb4
Column Character set: UTF8
Column Data Type: CHAR(36) UUID
From PHP in Connection object passing: utf8mb4
Index is applied
Above query is written in MySQL store procedure

SHOW CREATE TABLE

CREATE TABLE `notification` (
  `notification_id` CHAR(36) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,    
  `title` VARCHAR(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `created` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,  
  PRIMARY KEY (`notification_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8mb4

SHOW VARIABLES LIKE 'coll%';

collation_connection  utf8_general_ci                                 
collation_database   utf8mb4_unicode_ci                               
collation_server      latin1_swedish_ci                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        

SHOW VARIABLES LIKE 'char%';

character_set_client, Connection,Result, System:   utf8
character_set_database      utf8mb4 
character_set_server       latin1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
character_set_system       utf8    
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 

Any suggestion, what improvements are needed to make my queries faster?

CodePudding user response:

The table's character set is utf8, so I guess its collation is one of utf8_general_ci or utf8_unicode_ci. You can check this way:

SELECT collation_name from INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = '...your schema...' AND table_name = 'notification'
  AND column_name = 'ref_id';

You are forcing it to compare to a string with a utf8mb4 charset and collation. An index is a sorted data structure, and the sort order depends on the collation of the column. Using that index means taking advantage of the sort order to look up values rapidly, without examining every row.

When you compared the column to a string with a different collation, MySQL cannot infer that the sort order or string equivalence of your UUID constant is compatible. So it must do string comparison the hard way, row by row.

This is not a bug, this is the intended way for collations to work. To take advantage of the index, you must compare to a string with a compatible collation.


I tested and found that the following expressions fail to use the index:

Different character set, different collation:

WHERE ref_id = _utf8mb4'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE utf8mb4_general_ci

WHERE ref_id = _utf8mb4'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE utf8mb4_unicode_ci

Same character set, different collation:

WHERE ref_id = _utf8'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE 'utf8_unicode_ci'

The following expressions successfully use the index:

Different character set, default collation:

WHERE ref_id = _utf8mb4'c37e32fc-b3b5-11ec-befc-02447a44a47c'

Same character set, same collation:

WHERE ref_id = _utf8'c37e32fc-b3b5-11ec-befc-02447a44a47c' COLLATE 'utf8_general_ci'

Same character set, default collation:

WHERE ref_id = _utf8'c37e32fc-b3b5-11ec-befc-02447a44a47c'

To simplify your environment, I recommend you should just use one character set and one collation in all tables and in your session. I suggest:

ALTER TABLE notification CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

This will rebuild the indexes on string columns, using the sort order for the specified collation.

Then using COLLATE utf8mb4_unicode_ci will be compatible, and will use the index.

P.S. In all cases I omitted the NAME_CONST() function, because it has no purpose in a WHERE clause as far as I know. I don't know why you are using it.

CodePudding user response:

These say what the client is talking in:

collation_connection  utf8_general_ci
character_set_client, Connection,Result, System:   utf8

Either change them or change the various columns to match them.

If you have Stored routines, they need to be dropped, do SET NAMES to match what you picked, then re-CREATEd.

Since you are using 5.7, I recommend using utf8mb4 and utf8mb4_unicode_520_ci throughout.

  • Related