Home > other >  how to rewrite a join query with CTE?
how to rewrite a join query with CTE?

Time:11-10

here I have tried to rewrite the query with cte cuz of good readability but when I try to rewrite the data is mismatched how to solve the problem for this?

Query;

select count(1) as rage_tap
from ue_summary.summary_funnel_1066 s
join user_tasks_metadata utm on utm.asi = s.asi
join user_tasks ut on ut.id = utm.user_task_id
where s.seq_no = 1
   and s.created_at between '2022-09-27 00:00:00' and '2022-10-27 00:00:00'
   and ut.is_ragetap = 1


Explain plan ;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ut
   partitions: NULL
         type: ref
possible_keys: PRIMARY,idx_ir
          key: idx_ir
      key_len: 1
          ref: const
         rows: 8413412
     filtered: 100.00
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: utm
   partitions: NULL
         type: ref
possible_keys: id_asi,asi
          key: id_asi
      key_len: 8
          ref: ue_stage.ut.id
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: s
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY,unique_asi_seq_no,seq_no_date,created_at,idx_combo,idx_seq_created_asi
          key: unique_asi_seq_no
      key_len: 12
          ref: ue_stage.utm.asi,const
         rows: 1
     filtered: 50.00
        Extra: Using where; Using index

Table structure;

Create Table: CREATE TABLE `summary_funnel_1066` (
  `funnel_id` int DEFAULT NULL,
  `app_id` int DEFAULT NULL,
  `platform` int DEFAULT NULL,
  `app_version_id` int NOT NULL,
  `seq_no` int NOT NULL,
  `property_id` bigint DEFAULT NULL,
  `property_name` varchar(255) DEFAULT NULL,
  `property_type` varchar(50) DEFAULT NULL,
  `asi` bigint NOT NULL,
  `created_at` datetime NOT NULL,
  `capture_time_relative` decimal(15,4) DEFAULT NULL,
  `last_event_id` bigint DEFAULT NULL,
  `last_event_name` varchar(100) DEFAULT NULL,
  `last_message_id` bigint DEFAULT NULL,
  `last_message_name` varchar(100) DEFAULT NULL,
  `last_tag_id` bigint DEFAULT NULL,
  `last_tag_name` varchar(100) DEFAULT NULL,
  `is_crash` tinyint DEFAULT NULL,
  `is_anr` tinyint DEFAULT NULL,
  `is_ragetap` tinyint DEFAULT NULL,
  `last_error_type_id` bigint DEFAULT NULL,
  `last_error_type` varchar(100) DEFAULT NULL,
  `screen_id` bigint DEFAULT NULL,
  `screen_name` varchar(100) DEFAULT NULL,
  `last_screen_id` bigint DEFAULT NULL,
  `last_screen_name` varchar(100) DEFAULT NULL,
  `user_task_id` bigint DEFAULT NULL,
  `ue_id` bigint DEFAULT NULL,
  PRIMARY KEY (`asi`,`seq_no`,`created_at`,`app_version_id`),
  UNIQUE KEY `unique_asi_seq_no` (`asi`,`seq_no`),
  KEY `seq_no_date` (`seq_no`,`created_at`),
  KEY `last_ids` (`last_screen_id`,`last_event_id`),
  KEY `idx_seq_created_asi`(seq_no,created_at,asi),
  KEY `created_at` (`created_at`),
  KEY `idx_combo` (`seq_no`,`property_id`,`property_name`,`created_at`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1


       Table: user_tasks_metadata
Create Table: CREATE TABLE `user_tasks_metadata` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `user_task_id` bigint NOT NULL,
  `device_id` bigint NOT NULL,
  `custom_user_id` bigint DEFAULT NULL,
  `asi` bigint NOT NULL DEFAULT '0',
  `session_id` varchar(300) DEFAULT NULL,
  `model` bigint DEFAULT NULL,
  `api_level` varchar(300) DEFAULT NULL,
  `app_version_id` bigint NOT NULL DEFAULT '0',
  `os_version` bigint DEFAULT NULL,
  `location` bigint DEFAULT NULL,
  `connection_speed` varchar(10) DEFAULT NULL,
  `network_operator` varchar(100) CHARACTER SET utf8mb3 COLLATE utf8_general_ci DEFAULT NULL,
  `config_response` tinyint DEFAULT '1',
  `total_internal_memory` double(12,5) DEFAULT NULL,
  `available_internal_memory` double(12,5) DEFAULT NULL,
  `total_ram` double(12,5) DEFAULT NULL,
  `available_ram` double(12,5) DEFAULT NULL,
  `framework` varchar(45) DEFAULT '',
  `ue_sdk_version` mediumint DEFAULT NULL,
  `crash_type` bigint DEFAULT NULL,
  `created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `user_profile_id` bigint DEFAULT NULL,
  `associated_custom_user_id` bigint DEFAULT NULL,
  `first_usr_interaction` bigint DEFAULT NULL,
  `app_launch_type` varchar(45) DEFAULT '',
  `app_launch_time` bigint DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `session_metadata_filter_idx` (`custom_user_id`,`device_id`),
  KEY `usertask_fk_idx` (`user_task_id`),
  KEY `idx_app_version` (`app_version_id`),
  KEY `asi_idx` (`asi`),
  KEY `device_id` (`device_id`),
  KEY `user_profile_id` (`user_profile_id`),
  KEY `id_asi` (`user_task_id`,`asi`),
  KEY `asi` (`asi`)
) ENGINE=InnoDB AUTO_INCREMENT=2252872743 DEFAULT CHARSET=latin1


       Table: user_tasks
Create Table: CREATE TABLE `user_tasks` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `app_id` bigint NOT NULL,
  `status` tinyint NOT NULL DEFAULT '0',
  `app_version` varchar(100) DEFAULT NULL,
  `platform` tinyint NOT NULL DEFAULT '1',
  `exception_type` tinyint NOT NULL DEFAULT '0',
  `error_count` smallint NOT NULL DEFAULT '0',
  `crash_type` varchar(300) DEFAULT NULL,
  `crash_log` varchar(300) DEFAULT NULL,
  `avg_signal_level` int DEFAULT '0',
  `is_read` tinyint(1) NOT NULL DEFAULT '0',
  `is_important` tinyint(1) NOT NULL DEFAULT '0',
  `is_video_available` tinyint(1) NOT NULL DEFAULT '0',
  `is_video_played` tinyint(1) NOT NULL DEFAULT '0',
  `is_ex` tinyint(1) NOT NULL DEFAULT '0',
  `is_ragetap` tinyint(1) NOT NULL DEFAULT '0',
  `session_start_time` datetime DEFAULT NULL,
  `network_type` tinyint NOT NULL DEFAULT '0',
  `s3_video_url` varchar(255) DEFAULT NULL,
  `image_format` tinyint DEFAULT '0',
  `ue_release_version` smallint NOT NULL DEFAULT '0',
  `created_at` datetime NOT NULL,
  `updated_at` datetime DEFAULT NULL,
  `batch_created_at` datetime DEFAULT NULL,
  `sys_creation_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `session_filter_idx_2` (`app_id`,`platform`,`created_at`,`exception_type`,`app_version`),
  KEY `batch_created_idx` (`app_id`,`platform`,`batch_created_at`),
  KEY `app_id_created_at` (`app_id`,`created_at`),
  KEY `id_app_id` (`app_id`),
  KEY `idx_ir` (`is_ragetap`)
) ENGINE=InnoDB AUTO_INCREMENT=1648177712 DEFAULT CHARSET=latin1

rewritten query;

with cte1 as (
    select asi,count(1) as rage_tap
    from ue_summary.summary_funnel_1066 
    where s.seq_no = 1
        and s.created_at between '2022-09-27 00:00:00' and '2022-10-27 00:00:00'
),
 cte2 as (
    select id, count(*) 'rage_tap1' 
    from  user_tasks ut where is_ragetap = 1 
) 

select cte1.*,cte2.* from cte1
inner join user_tasks_metadata utm on utm.asi = cte1.asi
inner join cte2 on b.id = utm.user_task_id 

I need like below output;

 ---------- 
| rage_tap |
 ---------- 
|  1812564 |
 ---------- 

It takes time to search so I choose cte, I have tried with subquery but it does not work and it takes around 30 sec - 1.14 min.

as per this, I have indexed the column but also takes time : slow performance of query and scanning many rows

is there any other way to optimize it?

CodePudding user response:

This is your query with two CTEs. The aggregation takes place after the tables after the joins, just as in the original query.

with s as 
(
  select *
  from ue_summary.summary_funnel_1066
  where seq_no = 1
  and created_at >= date '2022-09-27'
  and created_at <  date '2022-10-27'
)
, ut as
(
  select * 
  from user_tasks 
  where is_ragetap = 1
) 
select count(*) as rage_tap
from s
join user_tasks_metadata utm on utm.asi = s.asi
join ut on ut.id = utm.user_task_id;

As created_at is a datetime, you should not use BETWEEN, but >= and <. Please check if the date range that I put in my query matches your requirements. It excludes 2022-10-27. If you want to include it, change this to and created_at < date '2022-10-28'.

CodePudding user response:

select id, count(*) 'rage_tap1' 
from  user_tasks ut where is_ragetap = 1 

does not make sense. There is an aggregate (COUNT(*)) but no GROUP BY. Were you showing to get one row? If so, which row? GROUP BY id does not make since id is Unique.

DOUBLE(m,n) is worse than simply DOUBLE. In fact, (m,n) is going away in 8.0 for FLOAT and DOUBLE.

When you have INDEX(a,b), you don't need INDEX(a).

count(1) as rage_tap is done after the JOINs, so it may have an inflated value. Did you do a sanity check?

utm has two indexes on asi. Toss both and add INDEX(asi, user_task_id)

As for turning the Joins into Ctes, go back to when you envisioned the query. You probably said "I need this stuff from this table", then "that stuff from that table", and finally "put things together this way". If you can go back to that thought process, you have the CTEs. (I don't have any idea what the data means or what the goal is, so I cannot reproduce that thought process.)

  • Related