Performance optimization Row_Number() over partition by:-CodePudding

I have been trying to optimize performance of a stored procedure that has left join between a table (Table1) and a subquery. It gets slow when the records in the department table increase; I think, it is due to the ROW_NUMBER function. It works fine with lower number of records in the Department table. There is a left join between Table1 and a subquery at the end of the stored procedure.

Below is sample tables of Table1, Employee and Department, involved in the query that is running slow.

CREATE TABLE [Table1]( 
   [ID] [int] NOT NULL PRIMARY KEY, 
   [Name] VARCHAR(250) NOT NULL,
   [DepartmentID] int 
) ON [PRIMARY] 

INSERT [Table1] ([ID], [Name], [DepartmentID])  
VALUES (1, N'A', 1), 
(2, N'D', 2), 
(3, N'C', 3), 
(4, N'E', 4),
(5, N'D', 5),  
(6, N'A', 6),
(7, N'B', 7)  
GO 

CREATE TABLE [Department]( 
   [DepartmentID] [int] NOT NULL PRIMARY KEY, 
   [Name] VARCHAR(250) NOT NULL, 
) ON [PRIMARY] 

INSERT [Department] ([DepartmentID], [Name])  
VALUES (1, N'Engineering'), 
(2, N'Administration'),  
(3, N'Sales'), 
(4, N'Marketing'), 
(5, N'Finance') 
GO 

CREATE TABLE [Employee]( 
   [EmployeeID] [int] NOT NULL PRIMARY KEY, 
   [FirstName] VARCHAR(250) NOT NULL, 
   [LastName] VARCHAR(250) NOT NULL, 
   [DepartmentID] [int] NOT NULL REFERENCES [Department](DepartmentID), 
) ON [PRIMARY] 
GO

INSERT [Employee] ([EmployeeID], [FirstName], [LastName], [DepartmentID]) 
VALUES (1, N'Orlando', N'Gee', 1 ), 
(2, N'Keith', N'Harris', 2 ), 
(3, N'Donna', N'Carreras', 3 ), 
(4, N'Janet', N'Gates', 3 )

Sample query that is running slow.

SELECT *
FROM Table1 AS t
    LEFT JOIN (SELECT D.Name
                     ,E.DepartmentID
                     ,ROW_NUMBER() OVER (PARTITION BY E.DepartmentID
                                         ORDER BY D.DepartmentID ASC
                                                 ,CASE WHEN D.Name IS NULL
                                                       THEN 1
                                                       ELSE 0
                                                       END ASC
                                                 ,D.Name ASC
                                        ) AS ord_index
               FROM Department AS D
                   INNER JOIN Employee AS E
                       ON D.DepartmentID = E.DepartmentID
              ) AS x
        ON x.DepartmentID = t.DepartmentID
WHERE x.ord_index = 1 OR x.ord_index IS NULL

I tried adding indexes per the suggestion of tuning advisor, but it doesn't improve the performance. I was also trying to use APPLY, but couldn't get it clear due to the ROW_NUMBER function. I have been trying to go through the related titles on the platform though.

I appreciate any guide in optimizing such query.

CodePudding user response：

You could refactor the query to use an outer apply which may result in a better execution plan (depending on the supporting indexes) such as :

select * from Table1 t 
outer apply (
    select D.Name,
        Row_Number() over (partition by E.DepartmentID order by D.DepartmentID asc, 
        case when D.Name is null then 1 
        else 0 end asc, D.Name asc) as ord_index 
    from Department D 
    join Employee E on D.DepartmentID = E.DepartmentID
    where E.DepartmentID = t.DepartmentID 
) x 
where x.ord_index = 1 or x.ord_index is null

See DB<>Fiddle showing better plan

CodePudding user response：

In the query, the section that is difficult to index is CASE WHEN D.Name IS NULL THEN 1 ELSE 0 END. This can, however, be indexed if you are willing to add an extra generated column in the Department table. For example:

alter table Department add name_flag as 
CASE WHEN D.Name IS NULL THEN 1 ELSE 0 END;

create index ix1 on Department (DepartmentID, name_flag, name);

With the the new column in place and the index, the query can be rephrased as:

SELECT * 
FROM Table1 t 
LEFT JOIN (
  SELECT D.Name, D.DepartmentID,  
    ROW_NUMBER() OVER (
      PARTITION BY D.DepartmentID  ORDER BY name_flag, D.Name
    ) AS ord_index
  FROM Department D 
  INNER JOIN Employee E ON D.DepartmentID = E.DepartmentID
) x ON x.DepartmentID = t.DepartmentID and x.ord_index = 1