I have been trying to optimize performance of a stored procedure that has left join between a table (Table1) and a subquery. It gets slow when the records in the department table increase; I think, it is due to the ROW_NUMBER function. It works fine with lower number of records in the Department table. There is a left join between Table1 and a subquery at the end of the stored procedure.
Below is sample tables of Table1, Employee and Department, involved in the query that is running slow.
CREATE TABLE [Table1](
[ID] [int] NOT NULL PRIMARY KEY,
[Name] VARCHAR(250) NOT NULL,
[DepartmentID] int
) ON [PRIMARY]
INSERT [Table1] ([ID], [Name], [DepartmentID])
VALUES (1, N'A', 1),
(2, N'D', 2),
(3, N'C', 3),
(4, N'E', 4),
(5, N'D', 5),
(6, N'A', 6),
(7, N'B', 7)
GO
CREATE TABLE [Department](
[DepartmentID] [int] NOT NULL PRIMARY KEY,
[Name] VARCHAR(250) NOT NULL,
) ON [PRIMARY]
INSERT [Department] ([DepartmentID], [Name])
VALUES (1, N'Engineering'),
(2, N'Administration'),
(3, N'Sales'),
(4, N'Marketing'),
(5, N'Finance')
GO
CREATE TABLE [Employee](
[EmployeeID] [int] NOT NULL PRIMARY KEY,
[FirstName] VARCHAR(250) NOT NULL,
[LastName] VARCHAR(250) NOT NULL,
[DepartmentID] [int] NOT NULL REFERENCES [Department](DepartmentID),
) ON [PRIMARY]
GO
INSERT [Employee] ([EmployeeID], [FirstName], [LastName], [DepartmentID])
VALUES (1, N'Orlando', N'Gee', 1 ),
(2, N'Keith', N'Harris', 2 ),
(3, N'Donna', N'Carreras', 3 ),
(4, N'Janet', N'Gates', 3 )
Sample query that is running slow.
SELECT *
FROM Table1 AS t
LEFT JOIN (SELECT D.Name
,E.DepartmentID
,ROW_NUMBER() OVER (PARTITION BY E.DepartmentID
ORDER BY D.DepartmentID ASC
,CASE WHEN D.Name IS NULL
THEN 1
ELSE 0
END ASC
,D.Name ASC
) AS ord_index
FROM Department AS D
INNER JOIN Employee AS E
ON D.DepartmentID = E.DepartmentID
) AS x
ON x.DepartmentID = t.DepartmentID
WHERE x.ord_index = 1 OR x.ord_index IS NULL
I tried adding indexes per the suggestion of tuning advisor, but it doesn't improve the performance. I was also trying to use APPLY, but couldn't get it clear due to the ROW_NUMBER function. I have been trying to go through the related titles on the platform though.
I appreciate any guide in optimizing such query.
CodePudding user response:
You could refactor the query to use an outer apply
which may result in a better execution plan (depending on the supporting indexes) such as :
select * from Table1 t
outer apply (
select D.Name,
Row_Number() over (partition by E.DepartmentID order by D.DepartmentID asc,
case when D.Name is null then 1
else 0 end asc, D.Name asc) as ord_index
from Department D
join Employee E on D.DepartmentID = E.DepartmentID
where E.DepartmentID = t.DepartmentID
) x
where x.ord_index = 1 or x.ord_index is null
See DB<>Fiddle showing better plan
CodePudding user response:
In the query, the section that is difficult to index is CASE WHEN D.Name IS NULL THEN 1 ELSE 0 END
. This can, however, be indexed if you are willing to add an extra generated column in the Department
table. For example:
alter table Department add name_flag as
CASE WHEN D.Name IS NULL THEN 1 ELSE 0 END;
create index ix1 on Department (DepartmentID, name_flag, name);
With the the new column in place and the index, the query can be rephrased as:
SELECT *
FROM Table1 t
LEFT JOIN (
SELECT D.Name, D.DepartmentID,
ROW_NUMBER() OVER (
PARTITION BY D.DepartmentID ORDER BY name_flag, D.Name
) AS ord_index
FROM Department D
INNER JOIN Employee E ON D.DepartmentID = E.DepartmentID
) x ON x.DepartmentID = t.DepartmentID and x.ord_index = 1