Home > Net >  MySQL statement that was working fine previously all of a sudden started being very slow. It include
MySQL statement that was working fine previously all of a sudden started being very slow. It include

Time:12-22

MySQL statement on a school system I have created some years back was working fine but now takes close to 30 seconds to pull what appears to me a simple statement but cannot figure out how to improve it. I am wondering if anyone can help me re-write this statement for a faster response. The statement is:

SELECT es.*,c.mainsubarea AS subject, b.name,b.email,GROUP_CONCAT(doc.document_file SEPARATOR "|") document_file 
    FROM usersubinfo es 
        LEFT JOIN userinfo b ON (es.uid=b.uid) 
        LEFT JOIN lkptsubjectarea c ON (es.mainsubjectarea=c.id) 
        LEFT JOIN lkptdeliverytime d ON (es.deliverytime = d.id) 
        LEFT JOIN documents doc ON (es.id = doc.order_id) 
    WHERE es.id AND es.is_active='Yes' 
    GROUP BY es.id 
    ORDER BY es.joindate 
    DESC LIMIT 0,25 

See screenshot from Explain statement on phpMyAdmin

CodePudding user response:

Add Indexes: These may help:

b:  INDEX(uid,  name, email)
doc:  INDEX(order_id,  document_file)

Remove LEFT: Is there a reason for LEFT JOIN instead of JOIN? I think not. See if you get the same results without LEFTs.

Remove bogus test: Why WHERE es.id? If id is the PRIMARY KEY of es, that test will always be true.

Improve GROUP ORDER: Change

    GROUP BY  es.id
    ORDER BY  es.joindate DESC
    LIMIT  0,25 

-->

    GROUP BY  es.joindate,      es.id
    ORDER BY  es.joindate DESC, es.id DESC
    LIMIT  0,25 

That avoids two passes over the data -- one for the GROUPing, and another for the ORDERing. Meanwhile, I assume that my grouping and ordering is "just as good".

Turn inside-out: That brings up another issue, one I call "explode-implode". That's where you Join together a lot of rows, only to get rid of most of them. So...

Start by finding the 25 ids desired with as little effort as possible:

SELECT  id
    FROM usersubinfo
    WHERE is_active = 'Yes'
    GROUP BY  joindate,      id
    ORDER BY  joindate DESC, id DESC
    LIMIT  0,25 

And include that as a 'derived' table in the rest:

SELECT  es.*, c.mainsubarea AS subject,
        b.name, b.email,
        GROUP_CONCAT(doc.document_file SEPARATOR "|") document_file
    FROM ( put the above Select here 
         )  AS ids
    JOIN  usersubinfo AS es  USING(id)
    JOIN  userinfo b  ON (es.uid=b.uid)
    JOIN  lkptsubjectarea c  ON (es.mainsubjectarea=c.id)
    JOIN  lkptdeliverytime d  ON (es.deliverytime = d.id)
    JOIN  documents doc  ON (es.id = doc.order_id)
    ORDER BY  joindate DESC, id DESC;  -- yes, repeat this

This is likely to be faster because the tables other than usersubinfo will be touched only 25 times.

(I think this will happen to avoid the "only_full_group_by" issue to which Parfait refers.)

CodePudding user response:

First and foremost, consider writing a valid ANSI SQL aggregate query with changes to your GROUP BY and SELECT clauses. Currently, your query includes only one column in GROUP BY but all columns from usersubinfo table with SELECT es.* in addition to other non-aggregated columns. You even order by a column not in GROUP BY.

Such a query runs against the SQL standard and will fail in most RDBMS's but allowable in MySQL due to its ONLY_FULL_GROUP_BY mode turned off which dangerously allows:

server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic, which is probably not what you want

Since you have an aggregate function, GROUP_CONCAT, all other non-aggregated columns should be placed in GROUP BY clause. If you need to add columns in SELECT add it also to GROUP BY. Also, you may have a redundant LEFT JOIN which serves no purpose for other JOINs or columns for SELECT.

SELECT es.id, 
       es.joindate,
       sa.mainsubarea AS subject,
       i.name,
       i.email,
       GROUP_CONCAT(doc.document_file SEPARATOR "|") document_file 
    FROM usersubinfo es 
        LEFT JOIN userinfo i ON (es.uid = i.uid) 
        LEFT JOIN lkptsubjectarea sa ON (es.mainsubjectarea = sa.id) 
        -- LEFT JOIN lkptdeliverytime dlv ON (es.deliverytime = dlv.id)  -- POSSIBLY REDUNDANT
        LEFT JOIN documents doc ON (es.id = doc.order_id) 
    WHERE es.id IS NOT NULL 
      AND es.is_active = 'Yes' 
    GROUP BY es.id,
             es.joindate,
             sa.mainsubarea,
             i.name,
             i.email,
    ORDER BY es.joindate DESC
    LIMIT 0 
    OFFSET 25 

Additionally, by avoiding SELECT * you avoid bringing in unneeded even newer columns, allowing indexes to run effectively over large table scans, and avoid sending large amount of content over network. See Why is SELECT * considered harmful?

  • Related