Home > Enterprise >  MySQL multipe subqueries result in very slow response
MySQL multipe subqueries result in very slow response

Time:11-04

I have two tables contacts and calllist. contacts has multiple columns containing phone numbers. calllist has only one column from_number containing phone numbers. I'm trying to get all phone numbers from the column from_number which do not match the phone numbers in the table calllist.

Here is my working but probably very inefficient and slow SQL query:

SELECT from_number AS phone_number, COUNT(from_number) AS number_of_calls
FROM calllist
WHERE from_number NOT IN (
    SELECT businessPhone1
    FROM contacts
    WHERE businessPhone1 IS NOT NULL
)
AND from_number NOT IN (
    SELECT businessPhone2
    FROM contacts
    WHERE businessPhone2 IS NOT NULL
)
AND from_number NOT IN (
    SELECT homePhone1
    FROM contacts
    WHERE homePhone1 IS NOT NULL
)
AND from_number NOT IN (
    SELECT homePhone2
    FROM contacts
    WHERE homePhone2 IS NOT NULL
)
AND from_number NOT IN (
    SELECT mobilePhone
    FROM contacts
    WHERE mobilePhone IS NOT NULL
)
AND (received_at BETWEEN '$startDate' AND DATE_ADD('$endDate', INTERVAL 1 DAY))
GROUP BY phone_number
ORDER BY number_of_calls DESC
LIMIT 10

How do i rewrite this SQL query to be faster? Any help would be much appreciated.

CodePudding user response:

try this

    SELECT from_number AS phone_number, COUNT(from_number) AS number_of_calls
    FROM calllist
    WHERE from_number NOT IN (
        SELECT businessPhone1
        FROM contacts
        WHERE businessPhone1 IS NOT NULL
UNION
SELECT businessPhone2
        FROM contacts
        WHERE businessPhone2 IS NOT NULL
UNION
SELECT homePhone1
        FROM contacts
        WHERE homePhone1 IS NOT NULL
UNION
SELECT homePhone2
        FROM contacts
        WHERE homePhone2 IS NOT NULL
UNION
SELECT mobilePhone
        FROM contacts
        WHERE mobilePhone IS NOT NULL
    )
    AND (received_at BETWEEN '$startDate' AND DATE_ADD('$endDate', INTERVAL 1 DAY))
    GROUP BY phone_number
    ORDER BY number_of_calls DESC
    LIMIT 10

CodePudding user response:

I don't like the schema design. You have multiple columns holding 'identical' data -- namely phone numbers. What if technology advances and you need a 6th phone number??

Instead, have a separate table of phone numbers, with linkage (id) to JOIN back to calllist. That gets rid of all the slow NOT IN ( SELECT... ), avoids a messy UNION ALL, etc.

If you desire, the new table could have a 3rd column that says which type of phone it is.

ENUM('unknown', 'company', 'home', 'mobile')

The simplified query goes something like

SELECT cl.from_number AS phone_number,
       COUNT(*) AS number_of_calls
    FROM calllist AS cl
    LEFT JOIN phonenums AS pn  ON cl.id = pn.user_id
    WHERE cl.received_at >= '$startDate' AND
      AND cl.received_at  < '$endDate'   INTERVAL 1 DAY
      AND pn.number IS NULL  -- not found in phonenums
GROUP BY phone_number
ORDER BY number_of_calls DESC
LIMIT 10
  • Related