Home > Mobile >  How can I make my SQL Server query efficient enough to not time out?
How can I make my SQL Server query efficient enough to not time out?

Time:07-23

I am working in an environment where I have a view, [stages].[jobStages], that contains the JobNumber, Region, and the the dates it completes each stage of the job. From this view I have created another view, [forecast].[DurationTable], containing the time between the stages {i.e. Duration1 = Stage2 - Stage1}. Example below:

JobNumber Region Stage1 Stage2 Stage3 Duration1 Duration2
12345 Houston 3/22/2022 4/18/2022 5/25/2022 27 37
42421 Houston 10/09/2021 11/29/2021 1/6/2022 51 38
98984 Houston 1/1/2022 2/2/2022 3/3/2022 32 29
78900 Dallas 4/23/2022 5/17/2022 5/29/2022 24 12

Now I need to create another view that finds the average time between each stage, by region, only using instances of that stage that have been completed within the past 4 months. An example of what this looks like (assuming today's date to be June 1st 2022):

JobNumber Region Stage1 Stage2 Stage3 Duration1 Duration2 AvgDuration1 AvgDuration2
12345 Houston 3/22/2022 4/18/2022 5/25/2022 27 37 29.5 33
42421 Houston 10/09/2021 11/29/2021 1/6/2022 51 38 29.5 33
98984 Houston 1/1/2022 2/2/2022 3/3/2022 32 29 29.5 33
78900 Dallas 4/23/2022 5/17/2022 5/29/2022 24 12 24 12

In the above example AvgDuration1 for the Houston jobs only considers the first and third record as Stage2 was completed within the last 4 months of todays date (June 1st) on these records so it averages Duration1 of 27 and 32.

I have created a query resembling the one below however it times out before execution. My actual query must AVG() 8 times as there are more stages than this example.

SELECT [JobNumber]
,[Region]
,[Stage1]
,[Stage2]
,[Stage3]
,[Duration1]
,[Duration2]
,(
    SELECT AVG(Duration1)
    FROM [forecast].[DurationTable]
    WHERE DATEDIFF(month, Stage2, GETDATE()) <= 4
    GROUP BY Region
) AS AvgDuration1
,(
    SELECT AVG(Duration2)
    FROM [forecast].[DurationTable]
    WHERE DATEDIFF(month, Stage3, GETDATE()) <= 4
    GROUP BY Region
) AS AvgDuration2
FROM [forecast].[DurationTable]

I am not very experienced with SQL but imagine that calling the [forecast].[DurationTable] and performing a calculation so many times is load intensive. I have tried forming [DurationTable] as a CTE but this did not increase performance. How can I achieve my result table in a reasonable amount of time (sub 15 minutes)?

CodePudding user response:

Your current query is invalid due to the GROUP BY Region as it would raise an error if more than one region exists - I assume that there is supposed to be a correlation on Region also.

An improvement is likely to be

SELECT [JobNumber],
       [Region],
       [Stage1],
       [Stage2],
       [Stage3],
       [Duration1],
       [Duration2],
       AvgDuration1 = AVG(CASE WHEN DATEDIFF(month, Stage2, GETDATE()) <= 4 THEN Duration1 END) OVER (PARTITION BY Region),
       AvgDuration2 = AVG(CASE WHEN DATEDIFF(month, Stage3, GETDATE()) <= 4 THEN Duration2 END) OVER (PARTITION BY Region)
FROM   [forecast].[DurationTable] 

Building views on top of views is often problematic though.

It is unclear how these Stage and Duration columns in the view are derived but you may end up with something much better performing if you write your queries against the base tables rather than the view.

  • Related