Home > front end >  More efficient way to write this MySQL query
More efficient way to write this MySQL query

Time:11-14

I am trying to speed this query up, and I am afraid it's impossible given the large amount of data. But I decided I'll throw it out there and see if anyone has any suggestions. I have two very simple tables:

contractorList

id contractor
1 John's Roofing
2 Mark's Roofing

form_submissions

id contractor_id submission
1 1 ...
2 2 ...
3 1 ...
4 4 ...
5 1 ...
6 5 ...

...ETC

I am trying to get JUST the contractor name, id and count the number of submissions. IE

SELECT a.id, a.contractor, count(b.contractor_id) AS submission_count
   FROM `contractorList` a
   LEFT JOIN contractor_form_submissions b
      ON b.contractor_id = a.id
   GROUP BY a.id
   ORDER BY a.contractor

The issue is that, I have about 1200 contractors .. Run by itself is lightning quick. But when I LEFT JOIN and count() their submissions (which can be in the thousands per contractor) the query grinds .. Like minutes, not seconds grinds. Is there a more efficient SQL only route to obtain the count without the query taking forever?

NOTE that contractorList.id AND form_submissions.id are both PK and indexed

CodePudding user response:

It may be better to do the aggregation before joining, as this creates a smaller intermediate table.

SELECT a.id, a.contractor, IFNULL(b.submission_count, 0) AS submission_count
FROM contractorList AS a
LEFT JOIN (
    SELECT contractor_id, COUNT(*) AS submission_count
    FROM contractor_form_submissions
    GROUP BY contractor_id
) AS b ON a.id = b.contractor_id
ORDER BY a.contractor

CodePudding user response:

I could provide another SQL solution using related subquery which I learned from 《Effective SQL》.
Sample Query: https://www.db-fiddle.com/f/4WX7yN4GWRrA1wX7zb8AXv/0

SELECT 
    a.id,
    a.contractor,
    (
      select count(*)
      from contractor_form_submissions
      where contractor_form_submissions.contractor_id = a.id
    ) as submission_count
FROM contractor as a
order by a.id;

The query efficiency is based on the sql server and its own optimizer. So I am not sure whether this is more efficient or not comparing to JOIN solution.

  • Related