Home > Software design >  Querying phone records
Querying phone records

Time:03-08

I want to use SQL to analyze this call data I have. The main question I want to answer is: If a caller dials a number and doesn't get a response the first time, does the value of some_factor (boolean column) impact the probability of the caller getting a response when redialing the same number later?

Here are the columns in this table:

| id | phone_number | callee_name | caller_id | call_duration | some_factor | created_at          |
|----|--------------|-------------|-----------|---------------|-------------|---------------------|
| 1  | 123-456-7890 | John Smith  | 42        |               | t           | 2022-03-07 09:40:40 |

Here was what I arrived at after some thinking, but I'm relatively inexperienced with SQL queries. If they worked, I could look at the ratio of the count from query 1 compared to the count from query 2, and then redo the queries where some_factor is false.

-- 1. successful retried calls: the callee picked up
SELECT COUNT(*)
FROM phone_calls
WHERE phone_number IN (
    SELECT phone_number
    FROM phone_calls
    WHERE call_duration IS NULL
    AND some_factor
)
AND call_duration IS NOT NULL

-- 2. unsuccessful retried calls: the callee did not pick up
SELECT COUNT(*)
FROM phone_calls
WHERE phone_number IN (
    SELECT phone_number
    FROM phone_calls
    WHERE call_duration IS NULL
    AND some_factor
)
AND call_duration IS NULL

-- then, do math with the results of these queries to get success rate when some_factor is true
-- then, redo but with AND NOT some_factor to get success rate when some_factor is false

But, I realized that 2 is counting entries from the subquery in the total count. I want the COUNT to not include the first time that someone called a number, if they did not pick up. Maybe if I can exclude entries with an id that corresponds to entries from the subquery, that will work. But I can't figure out how to do that. Am I missing something?

CodePudding user response:

As I understand in the second query you're not interested in the results returned by the subquery itself. So the count is as a result larger by the amount of distinct phone numbers returned by the subquery. You only need to subtract that number:

SELECT COUNT(*) - COUNT(DISTINCT phone_number)
FROM phone_calls
WHERE phone_number IN (
    SELECT phone_number
    FROM phone_calls
    WHERE call_duration IS NULL
    AND some_factor
)
AND call_duration IS NULL
  • Related