Home > Mobile >  Is partitioning column required in the clustered index in SQL Server
Is partitioning column required in the clustered index in SQL Server

Time:11-25

I have a table with these indexes:

pk_id_sales PRIMARY KEY (id) -> Clustered unique index
uk_sales_id UNIQUE(sales_id -> Non clustered unique index
uk_sales_date_party_name (sales_date, party_name) -> Non clustered, non unique index

I want to partition this table on the column sales_date.

  1. Should I include sales_date into the clustered index to get the benefits of partitioning? Is this an optional one? What should be the factors to be considered to make this decision if it is an optional one?

  2. What should be the order of columns in the clustered index If I add sales_date? Should it be (id, sales_date) or (sales_date, id)? What is the role of order here?

  3. Will the order of columns in the index make any performance impact in this case?

  4. If we include the partition column in the query, will the partition elimination happen always regardless of the indexes we have? (Eg: I already have a unique non clustered index on the sales_id (it doesn't contain sales_date). If I make a query with sales_id and sales_date in the where clause, will the partition elimination happen?)

  5. Please share if there is a comprehensive write up or video that will help to gain a fair understanding on the above given concepts.

Any response will be appreciated. I can share more details if required.

I tried the following scenarios on an existing empty table.In both cases, the new records are getting inserted to the respective partitions and partition elimination is happening properly (Found it based on the actual execution plan in azure data studio)

SCENARIO 1

I followed the below given steps based on a tutorial. I don't know are we performing the 4th step.

  1. Drop the existing clustered index on ID
  2. Create the a new non clustered index on ID
  3. Create a clustered index on sales_date
  4. Drop the clustered index on `sales_date'

SCENARIO 2

Based on another tutorial, I tried the following.

I followed the below given steps based on a tutorial. I don't know are we performing the 4th step.

  1. Drop the existing clustered index on ID
  2. Create the a new non clustered index on ID
  3. Create a clustered index on sales_date

CodePudding user response:

For your first question, the partitioning column is required to be specified explicitly as a key column for all unique indexes. Furthermore, SQL Server will automatically add the partitioning column to clustered index keys if not already specified.

The partitioning column is automatically added as an included column in non-unique non-clustered indexes when not already a key or included column.

EDIT:

For this question asked in comment:

The existing clustered index on my table is id (It is IDentical and auto incremented). I want to partition the table based on sales_date. My understanding is that we need to add sales_date to the clustered index. In the examples I saw on web, they are adding it as a second part of the clustered index, ie, (id, sales_date). But for me, it looks like (sales_date,id) will be more helpful as id is unique and it will not help to improve performance.

It depends on your queries. The partitioning column must be specified to eliminate parttions and the leftmost key column must be specified to perform an index seek.

With unique clustered index key (id,sales_date) and no other indexes:

  • WHERE id = 1 will perform an index seek against every partition to find the single row.
  • WHERE sales_date = '20221114' will perform a full scan of single partition containing the date and return only rows matching the date.
  • WHERE id = 1 AND sales_date = '20221114' will perform a seek against only the single partition containing the date and touch the single row.

With unique clustered index key (sales_date,id):

  • WHERE id = 1 will full scan every partition to find the single row.
  • WHERE sales_date = '20221114' will perform an index seek on only the partition containing the date and touch only rows that qualify.
  • WHERE id = 1 AND sales_date = '20221114' will perform and index seek only the partition containing the date and touch only the single row.
  • Related