Why is this query taking so long to execute?-CodePudding

I have this query below, and with just about 200k records in table, this query has started taking too long to execute. About 30 seconds or so.

I am not sure where or what is causing the problem.

I have other databases, with more than 2 million records, no issues of speed.

But somehow, for some reason, this query is causing problem on a site.

select p.pid, p.other_fields, c.user_name,
        group_concat( t.tag ) as tags 
from post_table as p, user_table as c, tag_table as t
where p.userID= c.userID 
and p.stat=1 
and p.mainID=0 
and c.stat='y' 
and  t.pid=p.pid 
group by p.pid 
order by p.pid desc 
limit 0, 20

This is the proper JOIN format of the same query, makes no difference, still slow. This below is actually what I had earlier, but then changed it to the above older format, just to try if it makes any difference.

select p.pid, p.other_fields, c.user_name, group_concat( t.tag ) as tags 
from post_table as p
    LEFT JOIN user_table as c on p.userID = c.userID  
    LEFT JOIN tag_table as t on p.pid = t.pid 
where p.stat=1 
and p.mainID=0 
and c.stat='y'
group by p.pid 
order by p.pid desc 
limit 0, 20

Structures and indexes on these tables:

post_table:
pid, userID, stat, mainID, title, other_fields...
index( userID, stat, mainID, title )

User_table:
userID, stat, user_name, pass_word, etc...
index( user_name, pass_word )
index( stat )

tag_table:
id, pid, tag
index( pid, tag )

I think I am following all indexes properly, but still the query takes lot of time to execute, and I don't know why.

Can someone please tell me what could be the reason? Thanks

Below is the output of the EXPLAIN statement of this query above:

But I am not sure what this is doing, however, I think that for some reason its ignoring the "stat" index on both user_table and post_table.

3 in array

Array
(
    [0] => Array
        (
            [id] => 1
            [select_type] => SIMPLE
            [table] => c
            [type] => ALL
            [possible_keys] => PRIMARY,id,id_2, userStat
            [key] => 
            [key_len] => 
            [ref] => 
            [rows] => 8
            [Extra] => Using where; Using temporary; Using filesort
        )

    [1] => Array
        (
            [id] => 1
            [select_type] => SIMPLE
            [table] => p
            [type] => ref
            [possible_keys] => PRIMARY,id,id_2, userID, postmainID
            [key] => userID
            [key_len] => 27
            [ref] => 
            [rows] => 15091
            [Extra] => 
        )

    [2] => Array
        (
            [id] => 1
            [select_type] => SIMPLE
            [table] => t
            [type] => ref
            [possible_keys] => pid
            [key] => pid
            [key_len] => 777
            [ref] => 
            [rows] => 1
            [Extra] => Using where; Using index
        )

)

CodePudding user response：

select  p.pid, p.other_fields, c.user_name,
        ( SELECT group_concat( t.tag ) FROM tag_table AS t
              WHERE t.pid = p.pid ) as tags
    FROM  post_table as p
    JOIN  user_table as c  ON p.userID = c.userID
    where  p.stat = 1
      and  p.mainID = 0
      and  c.stat = 'y'
    order by  p.pid desc
    limit  0, 20

p:  INDEX(stat, mainID, pid, userID,  other_fields)
c:  INDEX(userID, stat,  user_name)
t:  INDEX(pid,  tag)

The GROUP BY p.pid is probably redundant now, put it back in if you need it.

There is no performance difference between the old comma-join and the new JOIN..ON. There is a possible semantic difference between JOIN/comma-join and LEFT JOIN. My reformulation for tags is equivalent to LEFT JOIN. The existence of c.stat = ... forces the other LEFT JOIN to turn into JOIN, so no semantic difference.

pid is key_len=777? Please provide SHOW CREATE TABLE so I can understand. Ditto for userID and 27.

There are a lot of possible reasons for the Optimizer to avoid a given index. It will probably use my indexes in preference to all others.