Home > Enterprise >  do i have to add index for the below query or the query has some problem on it?
do i have to add index for the below query or the query has some problem on it?

Time:10-17

The below query is going for a full table scan and doesn't know where it has a problem.

Query:

select  journey_level, 
        src_screen_name from_entity_name, 
        src_screen_id from_entity_id, 
        dest_screen_name to_entity_name, 
        dest_screen_id to_entity_id, 
        dest_screen_id to_entity_id_list, 
        concat(journey_level, '-', src_screen_id, '-', src_screen_name) from_entity, 
        concat(journey_level   1, '-', dest_screen_id, '-', dest_screen_name) to_entity, 
        count(1) cnt, 
        sum(ifnull(is_crash, 0)) crash_cnt, 
        sum(ifnull(is_anr, 0)) anr_cnt, 
        sum(ifnull(is_rage, 0)) rage_cnt, 
        case dest_screen_id 
            when -1 
            then 'Red' 
            else '' 
        end node_color 
from ue_summary.user_journey_screens_1943 s 
where first_session_created_at between '2022-09-17 00:00:00' and '2022-10-17 10:42:35' 
and s.platform = '1' 
and journey_level <= '4' 
group by journey_level, 
        from_entity_id, 
        from_entity_name, 
        to_entity_id, 
        to_entity_name;

explain plan ;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: s
   partitions: NULL
         type: ALL
possible_keys: date_platform_level
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 799471542
     filtered: 1.67
        Extra: Using where; Using temporary

table structure;

*************************** 1. row ***************************
       Table: user_journey_screens_1943
Create Table: CREATE TABLE `user_journey_screens_1943` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `app_id` int DEFAULT NULL,
  `platform` tinyint DEFAULT NULL,
  `asi` bigint DEFAULT NULL,
  `first_session_created_at` datetime DEFAULT NULL,
  `user_task_id` bigint DEFAULT NULL,
  `app_version_id` int DEFAULT NULL,
  `journey_level` int DEFAULT NULL,
  `src_screen_id` bigint DEFAULT NULL,
  `src_screen_name` varchar(300) DEFAULT NULL,
  `src_capture_time` bigint DEFAULT NULL,
  `src_capture_time_relative` bigint DEFAULT NULL,
  `dest_screen_id` bigint DEFAULT NULL,
  `dest_screen_name` varchar(300) DEFAULT NULL,
  `dest_capture_time` bigint DEFAULT NULL,
  `dest_capture_time_relative` bigint DEFAULT NULL,
  `is_crash` tinyint DEFAULT NULL,
  `is_anr` tinyint DEFAULT NULL,
  `is_rage` tinyint DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `app_id_date_level` (`app_id`,`first_session_created_at`,`journey_level`),
  KEY `asi` (`asi`),
  KEY `date_platform_level` (`first_session_created_at`,`platform`,`journey_level`)
) ENGINE=InnoDB AUTO_INCREMENT=1937612717 DEFAULT CHARSET=latin1

did I miss any index or due to any other?

CodePudding user response:

Your query's WHERE clause contains two value-range filters, one on the DATETIME column and another on an INT with apparently small values. This is in addition to one equality filter on a TINYINT.

MySQL's indexes are BTREE indexes. You can think of them as lookup tables sorted in order. So, for a query to use an index effectively, the index should start with the columns you match for equality, and then end with just one column you range-match. I guess that your DATETIME column is more selective than your TINYINT column, so I guess the index should contain that column.

So, if you put an index on (platform, first_session_created_at) it will help your query. Why?

MySQL can random-access the index to the first eligible row. In your case it's the first row with platform=1 and a date >= the start of your date range.

Then, it can do an index range scan: it can scan the index sequentially until it gets to the last eligible row. Your plan shows, in contrast, that it is scanning through the entire table to get the results, because it decided that would take less time than scanning your index that starts with your DATETIME column. In general, an index range scan much faster than a full table scan. So, you want this:

CREATE INDEX platform_date ON tbl (platform, first_session_created_at);

There's one more possible refinement here: You are doing GROUP BY on several columns. If you include those columns in the index in the same order as your GROUP BY, MySQL may be able to do the grouping operation sequentially from the index, rather than having to build an internal temporary table. This may or may not work. If it does, using temporary won't appear in your EXPLAIN. Try adding columns to the index as follows:

CREATE INDEX platform_date ON tbl (platform, 
                                   first_session_created_at,
                                   journey_level, 
                                   from_entity_id, 
                                   from_entity_name, 
                                   to_entity_id, 
                                   to_entity_nam );
  • Related