I have a query and it seems very slow
My Problem
select conversation_hash as search_hash
from conversation
where conversation_hash ='xxxxx'
and result_published_at between '1600064000' and '1610668799'
order by result_published_at desc
limit 5
There is a total of 773179
Records when I run
select count(*)
from conversation
where conversation_hash ='xxxxx'
After I do an explain query
explain select conversation_hash as search_hash
from conversation
where conversation_hash ='xxxxx'
and result_published_at between '1600064000' and '1610668799'
order by result_published_at desc
limit 5
i got this
id,select_type,table,partitions,type,possible_keys,key,key_len,ref,rows,filtered,extra
1, SIMPLE, conversation, , range, idx_result_published_at,conversation_hash_channel_content_id_index,conversation_result_published_at_index,virtaul_ad_id_conversation_hash, idx_result_published_at, 5, , 29383288, 1.79, Using index condition;Using where
Possible Issues
- By looking in the explain query I can see it return more rows(29383288) than the total Records (ie 773179)
- key_len is 5. result_published_at is a timestamp field and its length is def more than 5 eg(1625836640)
What can I improve to make this query Fast, Thanks in advance
EDIT
Indexes for conversation
Table,Non_unique,Key_name,Seq_in_index,Column_name,Collation,Cardinality,Sub_part,Packed,Null,Index_type,Comment,Index_comment
conversation,0,PRIMARY,1,id,A,96901872,NULL,NULL,,BTREE,,
conversation,0,conversation_conversation_hash_id_result_id_unique,1,conversation_hash_id,A,240485,NULL,NULL,,BTREE,,
conversation,0,conversation_conversation_hash_id_result_id_unique,2,result_id,A,100693480,NULL,NULL,,BTREE,,
conversation,0,conversation_conversation_hash_id_channel_content_id_unique,1,conversation_hash_id,A,232122,NULL,NULL,,BTREE,,
conversation,0,conversation_conversation_hash_id_channel_content_id_unique,2,channel_content_id,A,100693480,NULL,NULL,,BTREE,,
conversation,1,conversation_tool_id_foreign,1,tool_id,A,7788,NULL,NULL,,BTREE,,
conversation,1,idx_result_published_at,1,result_published_at,A,38164712,NULL,NULL,YES,BTREE,,
conversation,1,idx_user_name,1,user_name,A,10896208,NULL,NULL,YES,BTREE,,
conversation,1,conversation_hash_channel_content_id_index,1,conversation_hash,A,294048,NULL,NULL,,BTREE,,
conversation,1,conversation_hash_channel_content_id_index,2,channel_content_id,A,99699696,NULL,NULL,,BTREE,,
conversation,1,idx_parent_channel_content_id,1,parent_channel_content_id,A,3550741,NULL,NULL,YES,BTREE,,
conversation,1,idx_channel_content_id,1,channel_content_id,A,90350472,NULL,NULL,,BTREE,,
conversation,1,conversation_result_published_at_index,1,result_published_at,A,37177476,NULL,NULL,YES,BTREE,,
conversation,1,virtaul_ad_id_conversation_hash,1,conversation_hash,A,238906,NULL,NULL,,BTREE,,
conversation,1,virtaul_ad_id_conversation_hash,2,virtual_ad_id,A,230779,NULL,NULL,YES,BTREE,,
conversation,1,idx_ad_story_id,1,ad_story_id,A,167269,NULL,NULL,YES,BTREE,,
CodePudding user response:
Query is correct, it seems you have to update server configurations for mysql, which is probably not available on a shared hosting environment. However, if you have your own server the follow these steps:
- Go to
my.cnf
file, in my case it is hosted at/etc/mysql/my.cnf
- Increate the values of
query_cache_size
,max_connection
,innodb_buffer_pool_size
,innodb_io_capacity
- Switch from
MyISAM
toInnoDB
(if possible) - Use latest MySQL version (if possible)
You can get more help from this article https://phoenixnap.com/kb/improve-mysql-performance-tuning-optimization
CodePudding user response:
It's a bit hard to read the output of the Explain command because the possible_keys
output is separated by commas.
Depending on the data access patterns, you might want to create a unique index on conversation_hash
, in case rows are unique.
If conversation_hash
is not a unique field you can create a compound index on conversation_hash, result_published_at
so your query will be fulfilled from the index itself.
CodePudding user response:
EXPLAIN
estimates the row counts. (It has no way to get the exact number of rows without actually running the query.) That estimate may be a lot lower or higher than the real count. It is rare for the estimate to be that far off, but I would not be worried, just annoyed.
The existence of Text and Blob columns sometimes adds to the imprecision of Explain.
Key_len:
- The raw length, which is 5 for
TIMESTAMP
(more below). - 1 for if
NULL
( 0 forNOT NULL
). - Not very useful for
VARCHAR
.
In older versions of MySQL, a Timestamp took 4 bytes and a DATETIME took 8. When fractional seconds were added, those numbers were changed to 5 in both cases. This allowed for a "length" to indicate the number of decimal places. And Datetime was changed from packed decimal to an integer.
Suggest you run ANALYZE TABLE
. This might improve the underlying statistics that feed into the estimates.
Please provide SHOW CREATE TABLE
; it may give more insight.
The 'composite' INDEX(conversation_hash, result_published_at)
, in that order, is optimal for that query.