Home > Mobile >  Optimizing a complex mysql to reduce query time
Optimizing a complex mysql to reduce query time

Time:10-08

I have a very complex query, I'm already optimized a lot, But I couldn't find a better way to write a more optimized query and reduce the query timings. Let me share details so you can understand it better.

Below is my query.

SELECT a.msisdn, GROUP_CONCAT( COALESCE(a.answer, 'skip') order by a.question_id separator ',') as answer,

(SELECT GROUP_CONCAT( COALESCE(answer, 'skip') order by question_id separator ',') as answer FROM  `campaign_survey_responses`
 WHERE campaign_id = 11559 and question_id=14751 and msisdn=a.msisdn) as a1,
 
 (SELECT GROUP_CONCAT( COALESCE(s.answer, 'skip') order by s.question_id separator ',') as answer 
FROM  `campaign_survey_responses` s left join campaign_survey_questions q on q.id = s.question_id 
WHERE s.campaign_id = 11559 and q.parent_id=5128 and q.sort_order = 0 and s.msisdn=a.msisdn) as sur1,
 
(SELECT GROUP_CONCAT( COALESCE(answer, 'skip') order by question_id separator ',') as answer FROM  `campaign_survey_responses`
WHERE campaign_id = 11559 and question_id=14768 and msisdn=a.msisdn) as a2,

(SELECT GROUP_CONCAT( COALESCE(s.answer, 'skip') order by s.question_id separator ',') as answer 
FROM  `campaign_survey_responses` s left join campaign_survey_questions q on q.id = s.question_id 
WHERE s.campaign_id = 11559 and q.parent_id=5108 and q.sort_order = 0 and s.msisdn=a.msisdn) as sur2,

(SELECT GROUP_CONCAT( COALESCE(answer, 'skip') order by question_id separator ',') as answer FROM  `campaign_survey_responses`
WHERE campaign_id = 11559 and question_id=14785 and msisdn=a.msisdn) as a3,

(SELECT GROUP_CONCAT( COALESCE(s.answer, 'skip') order by s.question_id separator ',') as answer 
FROM  `campaign_survey_responses` s left join campaign_survey_questions q on q.id = s.question_id 
WHERE s.campaign_id = 11559 and q.parent_id=5148 and q.sort_order = 0 and s.msisdn=a.msisdn) as sur3

FROM  `campaign_survey_responses` a
WHERE a.campaign_id = 11559 and a.question_id=14750 group by msisdn limit 500;

Above query combines 1 millions rows to make around 50,000 rows only, Now only purpose is to export as csv. But the above query takes too much time when it runs.

I have applied limit of 500 from that query, 500 rows limit is taking around 78.844856977463 seconds, but what about 50K rows? It's going to take down the server.

Any better approach to optimize the query, I'm using Mysql 5.7, Thanks

It returns data like below, for limit 500, there will be 500 record, below are just 2 record as an example.

0 => 
    array (size=8)
      'msisdn' => string '3003932957' (length=10)
      'answer' => string '1' (length=1)
      'answer1' => string '4' (length=1)
      'survey1' => null
      'answer2' => null
      'survey2' => null
      'answer3' => null
      'survey3' => null
  1 => 
    array (size=8)
      'msisdn' => string '3013555354' (length=10)
      'answer' => string '1' (length=1)
      'answer1' => string '3' (length=1)
      'survey1' => string '2,1,1' (length=5)
      'answer2' => null
      'survey2' => null
      'answer3' => null
      'survey3' => null
  2 => 

Table Structure:

CREATE TABLE `campaign_survey_responses` (
 `id` int unsigned NOT NULL AUTO_INCREMENT,
 `campaign_id` int unsigned NOT NULL,
 `question_id` int unsigned NOT NULL,
 `answer` varchar(20) DEFAULT NULL,
 `msisdn` int unsigned NOT NULL,
 `campaign_date` date DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `campaign_id` (`campaign_id`),
 KEY `question_id` (`question_id`)
) ENGINE=MyISAM AUTO_INCREMENT=9016122 DEFAULT CHARSET=utf8

Table structure is added, Basically the system generated calls, based on user input, it gets responses like 1,2,3 etc are added to above table. For example, Its a type of multi level survey. There is one parent question, It has 3 choices, If user choose choice 2, the system will play question related to choice 2. After that question, if user choose any choice, there will be series of 10 questions against choice 2.

Similarly for choice 1, and 3.

Parent Question ( it has 3 sub questions )
Each 3 of subquestions has 10 questions each.

If user choose sub question 1, system will play 10 questions related to sub question 1, He will not be played other related questions for choice 2,3. It depends on user, which question, he wants to play.

CodePudding user response:

These composite indexes on campaign_survey_responses will help a bunch:

INDEX(campaign_id, question_id,  msisdn, answer)
INDEX(campaign_id, msisdn, question_id,  answer)

(The first will probably help a; the second will probably help s.)

CodePudding user response:

The correlated subquery in the SELECT can be a real drag. And you have many of these. If this is unavoidable, I would apply it to subset of results and not entire result. i.e. I would create inline view first

SELECT 
   -- your correlated subqueries as fields here
FROM
   ( -- start inline view
       SELECT -- your columns
       FROM  `campaign_survey_responses` a
       WHERE a.campaign_id = 11559 and a.question_id=14750 
       GROUP BY msisdn 
       LIMIT 500;
   ) T1

This way only 500 records will be involved in correlated subqueries

  • Related