Suppose, I have following table structure:
CREATE TABLE [dbo].[Users]
(
[Id] [INT] IDENTITY(1,1) NOT NULL,
[Age] [INT] NULL,
[CreationDate] [DATETIME] NOT NULL,
[DisplayName] [NVARCHAR](40) NOT NULL,
[Location] [NVARCHAR](100) NULL,
CONSTRAINT [PK_Users_Id]
PRIMARY KEY CLUSTERED ([Id] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
As we can see in the above code, the [Id]
column is the primary key clustered index column.
Now, I am creating the following two non-clustered indexes:
-- Index #1
CREATE INDEX IX_CreationDate_Id_DisplayName_Age1
ON dbo.Users(CreationDate, Id)
INCLUDE (DisplayName, Age);
-- Index #2
CREATE INDEX IX_CreationDate_Id_DisplayName_Age2
ON dbo.Users(CreationDate)
INCLUDE (DisplayName, Age);
-- Index #3
CREATE INDEX IX_CreationDate_Id_DisplayName_Age3
ON dbo.Users(CreationDate)
INCLUDE (ID, DisplayName, Age);
My questions is as follows: we know that the Id
column is the primary key clustered index key column, so it will always be part of all non clustered indexes, whether it's added as a non-clustered index key or not.
Then in what scenarios do we need to add the Id
column to be part of the non-clustered index key explicitly, as its added in Index #1. In all three cases of Index creation, where Id
columns will be stored in the B-Tree structure of the non-clustered index?
CodePudding user response:
As you noted, since the ID is the primary key, it is always implicitly included as a mechanism to link the non-clustered index back to the table's clustered index. There is no reason to list it as an included column.
There could be a case, as in your Index #1, where you want to use the PK as part of a composite key to sort the order that the non-clustered index has. However, in your case, if creation dates are typically unique values, it would serve no additional value in terms of the data's sorted order.
CodePudding user response:
In any index, SQL Server requires that the internal structure of the index should be unique. When a non-clustered index is created on columns that are not otherwise guaranteed to be unique, a uniqifier is added. In normal circumstances, this is the clustering key, as that needs to be there anyway.
So adding the clustering key as the last key column doesn't add any usefulness, nor does adding it as an INCLUDE
column. (And the order of INCLUDE
columns never matters anyway.)
However, adding it as a non-last key column means that all columns after that become pointless, as once the B-tree reaches that key level it is unique anyway.
It can still be useful to include the clustering key in the index definition, if there is any reason to suspect that a DBA may later change the clustering key, and you wish for that index to continue to be sorted according to the clustering key.