I have a query that calculates wait time for each record from the Transactions
table and calculates SUM and MAX wait time for each of the group below based on receivedDate
and claimedDt
.
Here is a basic query:
SELECT
'2022-06-01' as reportDate,
waitTimeSubQuery.currentAssignedQueueId,
waitTimeSubQuery.queueAccessPointId,
waitTimeSubQuery.queueName AS QueueName,
waitTimeSubQuery.queueReportCategory,
waitTimeSubQuery.queuePriority,
waitTimeSubQuery.queueOrganizationHierarchyId,
COUNT(waitTimeSubQuery.id) AS totalCasesWaiting,
SUM(CASE WHEN waitTimeSubQuery.waitTimeMinutes > 0 THEN waitTimeSubQuery.waitTimeMinutes ELSE 0 END) AS sumWaitTimeMinutes,
MAX(CASE WHEN waitTimeSubQuery.waitTimeMinutes > 0 THEN waitTimeSubQuery.waitTimeMinutes ELSE 0 END) AS maxWaitTimeMinutes
FROM (SELECT
id,
currentAssignedQueueId,
queueAccessPointId,
queueName,
queueReportCategory,
queuePriority,
queueOrganizationHierarchyId,
(CASE WHEN (receivedDateUTC > '0001-01-01T00:00:00Z' OR receivedDate > '0001-01-01T00:00:00Z') AND (appointmentDT IS NULL OR appointmentDT < '2022-09-28T12:58:47')
THEN CAST(DateDiff(MINUTE,
CASE WHEN receivedDateUTC > '0001-01-01T00:00:00Z'
THEN CONCAT(SUBSTRING(CAST(receivedDateUTC AS VARCHAR), 0, 19), 'Z')
ELSE TRY_CAST(CONCAT(CONCAT(SUBSTRING(CAST(receivedDate AS VARCHAR), 0, 10), SUBSTRING(CAST(createdDT AS VARCHAR), 10, 9)), 'Z') AS DATETIME2) END,
CASE WHEN claimedDT > '0001-01-01T00:00:00Z' AND claimedDT < '2022-06-01T23:59:00Z' AND transactionStatus != 'WaitingAssignment'
THEN CONCAT(SUBSTRING(CAST(claimedDT AS VARCHAR), 0, 19), 'Z')
ELSE '2022-06-01T23:59:00Z' END
) AS BIGINT)
ELSE 0 END) AS waitTimeMinutes
FROM @transactionsList
WHERE receivedDate <= '2022-06-01T00:00:00'
AND (claimedDT >= '2022-06-02T00:00:00' OR transactionStatus = 'WaitingAssignment')) waitTimeSubQuery
GROUP BY waitTimeSubQuery.currentAssignedQueueId,
waitTimeSubQuery.queueAccessPointId,
waitTimeSubQuery.queueName,
waitTimeSubQuery.queueReportCategory,
waitTimeSubQuery.queuePriority,
waitTimeSubQuery.queueOrganizationHierarchyId
I want to calculate statistics for each day for the period of 30 days and the only difference is dates used for calculating waitTimeMinutes
(based on endDate
, claimedDt
, receivedDate
) and filtering by receivedDate
and claimedDt
.
I tried to save sub-set of the Transactions data into a table variable and reuse it in multiple select queries to get the statistics for each day, but this script runs too slow
Here is the code:
DECLARE @transactionsList TABLE (
id UNIQUEIDENTIFIER,
currentAssignedQueueId UNIQUEIDENTIFIER,
queueAccessPointId UNIQUEIDENTIFIER,
queueName VARCHAR(100),
queueReportCategory VARCHAR(100),
queuePriority INT,
queueOrganizationHierarchyId UNIQUEIDENTIFIER,
receivedDate DATE,
claimedDT DATE,
transactionStatus VARCHAR(100),
receivedDateUTC DATE,
appointmentDT DATE,
createdDT DATE)
INSERT INTO @transactionsList
SELECT
Transactions.id,
Transactions.currentAssignedQueueId,
Queues.accessPointId as queueAccessPointId,
Queues.name as queueName,
Queues.reportCategory as queueReportCategory,
Queues.priority as queuePriority,
Queues.organizationHierarchyId as queueOrganizationHierarchyId,
Transactions.receivedDate,
Transactions.claimedDT,
Transactions.transactionStatus,
Transactions.receivedDateUTC,
Transactions.appointmentDT,
Transactions.createdDT
FROM Transactions
LEFT JOIN Queues ON Transactions.currentAssignedQueueId = Queues.Id
WHERE Transactions.consumerId = '66458f4a-b3d4-4f80-93d4-5aa3ea123249'
AND Transactions.isActive = 1
AND Transactions.receivedDate <= '2022-06-30T00:00:00'
AND (Transactions.claimedDT >= '2022-06-02T00:00:00' OR Transactions.transactionStatus = 'WaitingAssignment')
--SELECT COUNT(*) FROM @transactionsList
-- 2022-06-01
SELECT
'2022-06-01' as reportDate,
waitTimeSubQuery.currentAssignedQueueId,
waitTimeSubQuery.queueAccessPointId,
waitTimeSubQuery.queueName AS QueueName,
waitTimeSubQuery.queueReportCategory,
waitTimeSubQuery.queuePriority,
waitTimeSubQuery.queueOrganizationHierarchyId,
COUNT(waitTimeSubQuery.id) AS totalCasesWaiting,
SUM(CASE WHEN waitTimeSubQuery.waitTimeMinutes > 0 THEN waitTimeSubQuery.waitTimeMinutes ELSE 0 END) AS sumWaitTimeMinutes,
MAX(CASE WHEN waitTimeSubQuery.waitTimeMinutes > 0 THEN waitTimeSubQuery.waitTimeMinutes ELSE 0 END) AS maxWaitTimeMinutes
FROM (SELECT
id,
currentAssignedQueueId,
queueAccessPointId,
queueName,
queueReportCategory,
queuePriority,
queueOrganizationHierarchyId,
(CASE WHEN (receivedDateUTC > '0001-01-01T00:00:00Z' OR receivedDate > '0001-01-01T00:00:00Z') AND (appointmentDT IS NULL OR appointmentDT < '2022-09-28T12:58:47')
THEN CAST(DateDiff(MINUTE,
CASE WHEN receivedDateUTC > '0001-01-01T00:00:00Z'
THEN CONCAT(SUBSTRING(CAST(receivedDateUTC AS VARCHAR), 0, 19), 'Z')
ELSE TRY_CAST(CONCAT(CONCAT(SUBSTRING(CAST(receivedDate AS VARCHAR), 0, 10), SUBSTRING(CAST(createdDT AS VARCHAR), 10, 9)), 'Z') AS DATETIME2) END,
CASE WHEN claimedDT > '0001-01-01T00:00:00Z' AND claimedDT < '2022-06-01T23:59:00Z' AND transactionStatus != 'WaitingAssignment'
THEN CONCAT(SUBSTRING(CAST(claimedDT AS VARCHAR), 0, 19), 'Z')
ELSE '2022-06-01T23:59:00Z' END
) AS BIGINT)
ELSE 0 END) AS waitTimeMinutes
FROM @transactionsList
WHERE receivedDate <= '2022-06-01T00:00:00'
AND (claimedDT >= '2022-06-02T00:00:00' OR transactionStatus = 'WaitingAssignment')) waitTimeSubQuery
GROUP BY waitTimeSubQuery.currentAssignedQueueId,
waitTimeSubQuery.queueAccessPointId,
waitTimeSubQuery.queueName,
waitTimeSubQuery.queueReportCategory,
waitTimeSubQuery.queuePriority,
waitTimeSubQuery.queueOrganizationHierarchyId
And the part from '2022-06-01' repeats 30 times for each day from the range respectively.
Is there any possibility to optimize this script? For my data it runs nearly 49 seconds and the number of the records in the table is only 7kk and it's far way more on other environment.
---EDIT---
Here are the definitions for the tables Transactions
and Queues
CREATE TABLE [dbo].[Transactions] (
[id] UNIQUEIDENTIFIER NULL,
[accessPointId] UNIQUEIDENTIFIER NULL,
[isNonLobby] BIT NULL,
[printReceipt] BIT NULL,
[isExpressed] BIT NULL,
[isAssignedBySupervisor] BIT NULL,
[dateAssignedBySupervisor] VARCHAR (30) NULL,
[dateAssignedBySupervisorLocal] DATETIMEOFFSET (7) NULL,
[supervisorOverrideId] UNIQUEIDENTIFIER NULL,
[supervisorId] UNIQUEIDENTIFIER NULL,
[isNewHousehold] BIT NULL,
[householdId] UNIQUEIDENTIFIER NULL,
[receivedDate] DATE NULL,
[receivedDateUTC] DATETIME2 (7) NULL,
[transactionStatus] VARCHAR (20) NULL,
[claimedDT] DATETIME2 (7) NULL,
[claimedDTLocal] DATE NULL,
[initiallyClaimedById] UNIQUEIDENTIFIER NULL,
[completedDT] DATETIME2 (7) NULL,
[appointmentDT] DATETIME2 (7) NULL,
[completedDTLocal] DATE NULL,
[completedById] UNIQUEIDENTIFIER NULL,
[remarks] VARCHAR (MAX) NULL,
[currentAssignedQueueId] UNIQUEIDENTIFIER NULL,
[currentAssignedUserId] UNIQUEIDENTIFIER NULL,
[preTriageTransactionId] UNIQUEIDENTIFIER NULL,
[isAddClaim] BIT NULL,
[receiptId] UNIQUEIDENTIFIER NULL,
[ticketNumber] VARCHAR (100) NULL,
[deliNumber] VARCHAR (100) NULL,
[tasksCount] BIGINT NULL,
[projectedWaitTime] BIGINT NULL,
[actualWaitTimeMinutes] BIGINT NULL,
[assignWaitTime] BIGINT NULL,
[transactionTimeMinutes] BIGINT NULL,
[triageTimeMinutes] BIGINT NULL,
[outcomeId] UNIQUEIDENTIFIER NULL,
[outcomeReasonId] UNIQUEIDENTIFIER NULL,
[outcomeType] VARCHAR (50) NULL,
[dueDate] VARCHAR (30) NULL,
[dueDateLocal] DATETIMEOFFSET (7) NULL,
[isDueDateToday] VARCHAR (30) NULL,
[isUnknownHousehold] BIT NULL,
[activityId] UNIQUEIDENTIFIER NULL,
[programId] UNIQUEIDENTIFIER NULL,
[reviewMonthDueDate] VARCHAR (40) NULL,
[reviewMonthDueDateLocal] DATETIMEOFFSET (7) NULL,
[officeId] UNIQUEIDENTIFIER NULL,
[isOnceManuallyPaused] BIT NULL,
[pkey] VARCHAR (100) NULL,
[isActive] BIT NULL,
[consumerId] UNIQUEIDENTIFIER NULL,
[organizationName] VARCHAR (50) NULL,
[modifiedBy] UNIQUEIDENTIFIER NULL,
[modifiedDT] DATETIME2 (7) NULL,
[modifiedDTLocal] DATE NULL,
[createdBy] UNIQUEIDENTIFIER NULL,
[createdDT] DATETIME2 (7) NULL,
[createdDTLocal] DATE NULL,
[_ts] BIGINT NULL,
[type] VARCHAR (50) NULL,
[timezoneId] VARCHAR (50) NULL
);
CREATE TABLE [dbo].[Queues] (
[id] UNIQUEIDENTIFIER NULL,
[name] VARCHAR (255) NULL,
[accessPointId] UNIQUEIDENTIFIER NULL,
[organizationHierarchyId] UNIQUEIDENTIFIER NULL,
[assignedOrganizationHierarchyLevel] VARCHAR (20) NULL,
[bundlingGroup] BIGINT NULL,
[reportCategory] VARCHAR (100) NULL,
[priority] INT NULL,
[businessProcessThreshold] BIGINT NULL,
[calculateProjectedWaitTime] BIT NULL,
[waitTimeUnits] VARCHAR (10) NULL,
[hasCarryOver] BIT NULL,
[defaultTransactionTimeMinutes] BIGINT NULL,
[latestQueueMetricId] UNIQUEIDENTIFIER NULL,
[isAppointment] BIT NULL,
[consumerId] UNIQUEIDENTIFIER NULL,
[organizationName] VARCHAR (50) NULL,
[modifiedBy] UNIQUEIDENTIFIER NULL,
[modifiedDT] DATE NULL,
[createdBy] UNIQUEIDENTIFIER NULL,
[createdDT] DATE NULL,
[_ts] BIGINT NULL
);
And an index for the Transactions
table (this is the only one index for this table and there are no indexes for the Queues
table):
CREATE NONCLUSTERED INDEX [nci_wi_Transactions_EBAFDE3A7C2969265E76135FBA69188D] ON [dbo].[Transactions]
(
[consumerId] ASC,
[isActive] ASC,
[receivedDate] ASC
)
INCLUDE([appointmentDT],[claimedDT],[createdDT],[currentAssignedQueueId],[id],[receivedDateUTC],[transactionStatus]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
Here is the link on GitHub for the execution plan and the full query
https://github.com/sskotliar/query_optimization
CodePudding user response:
Option - 1
You can improve the performance of your query by using a CTE
. How does CTE
work? The query you write inside the WITH
command is executed only once, and its result is stored as a temporary table. This allows you to execute your large query with multiple join in it once. In the remaining cases, only its result is used. But as I understood you call your query with different filters. If this is true, so please see Option 2.
Option - 2
If the result of your repeatable query is large, and you call this query with different filters, it is recommended that you, insert result of repeatable query into temp table, then create needed indexes for this temp table. Then you can use that table as much as you need. Since your table has indexes so your filters will be works high performance.
For each of the options mentioned above, it is recommended that you view Query Plan. Because the query plan may change from the DB side depending on the count of records in the tables.