I haven't used SQL in quite a while, so I'm a bit lost here. I wanted to check for rows with duplicate values in the "Duration" and "date" columns to remove them from the query results. I would need to keep the rows where column = "Transfer" since these hold more information about the call and how it was routed through our system.
I want to use this for a dashboard, which would include counting the total number of calls from that query, which is why I cannot have both.
Here's the (Simplified) code used:
SELECT status, user, duration, phonenumber, date
FROM (SELECT * FROM view_InboundPhoneCalls) as Phonecalls
WHERE date>=DATEADD(dd, -15, getdate())
--GROUP BY duration
Which gives something of the sort:
Status | User | Duration | phonenumber | date |
---|---|---|---|---|
Received | Receptionnist | 00:34:03 | from: 1234567890 | 2021-09-30 16:01:57 |
Received | Receptionnist | 00:03:12 | from: 9876543210 | 2021-09-30 16:02:40 |
Transfer | User1 | 00:05:12 | 14161654965;Receptionnist;User1 | 2021-09-30 16:01:57 |
Received | Receptionnist | 00:05:12 | from: 14161654965 | 2021-09-30 16:01:57 |
The end result would be something like this:
Status | User | Duration | phonenumber | date |
---|---|---|---|---|
Received | Receptionnist | 00:34:03 | from: 1234567890 | 2021-09-30 16:01:57 |
Received | Receptionnist | 00:03:12 | from: 9876543210 | 2021-09-30 16:02:40 |
Transfer | Receptionnist | 00:05:12 | 14161654965;Receptionnist;User1 | 2021-09-30 16:01:57 |
CodePudding user response:
You could go for a blacklist, say with a CTE, then filter out the undesired rows.
Something like:
WITH Blacklist ([date], [duration]) AS (
SELECT [date], [duration] FROM view_InboundPhoneCalls
GROUP BY [date], [duration]
Having count(*) > 1
)
SELECT status, user, duration, phonenumber, date
FROM
(SELECT * FROM view_InboundPhoneCalls) as Phonecalls
LEFT JOIN
Blacklist
ON Phonecalls.[date] = Blacklist.[date]
AND Phonecalls.[duration] = Blacklist.[duration]
Where
Blacklist.[date] is null
Or
(Blacklist.[date] is not null AND Phonecalls.[Status] == 'Transfer')
CodePudding user response:
You can use row-numbering for this, along with a custom ordering. There is no need for any joins.
SELECT status, [user], duration, phonenumber, date
FROM (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY duration, date
ORDER BY CASE WHEN Status = 'Transfer' THEN 1 ELSE 2 END)
FROM view_InboundPhoneCalls
WHERE date >= DATEADD(day, -15, getdate())
) as Phonecalls
WHERE rn = 1
CodePudding user response:
The normal "trick" is to detect duplicates first. One of the easier ways is a CTE (Common Table Expression) along with the ROW_NUMBER() function.
Part One - Mark the duplicates
WITH
cte_Sorted_List
(
status, usertype, duration, phonenumber, dated, duplicate_check
)
AS
( -- only use required fields to speed up
SELECT status, user, duration, phonenumber, date,
-- marks depend on correct columns!
Row_Number() OVER
( -- sort over relevant columns to show
PARTITION BY user, phonenumber, date, duration
-- with correct sort order
-- bit of hack: As T comes after R
-- logic: mark records to show as row number 1 in duplicate list
ORDER BY status DESC
) AS duplicate_check
FROM view_InboundPhoneCalls
-- and lose all unnecessary data
WHERE date>=DATEADD(dd, -15, getdate())
)
Part two - show relevant rows
SELECT
status, usertype, duration, phonenumber, dated
FROM
cte_Sorted_List
WHERE
Duplicate_Check = 1
;
First CTE extracts required fields in single pass, then that data only is used for output.