Home > Enterprise >  Should we use column used in order by clause as include column of NC INDEX
Should we use column used in order by clause as include column of NC INDEX

Time:10-02

I have created a NC index and used "LastUpdated" as a include column in index but "LastUpdated" is being used in order by clause in my query . Should we use column used in order by clause as include column of NC INDEX?

   CREATE NONCLUSTERED INDEX [IXNI__symboltab__Status_active_Symbol] 
   ON symboltab (Status,active,Symbol)
   INCLUDE (LastUpdated)

    SELECT TOP 10000 symbol,LastUpdated 
    from symboltab
    with (nolock index = IXNI__symboltab__Status_active_Symbol)
    WHERE 
        active = 1 
        AND Status = 999 --999 is default
        AND Symbol NOT LIKE '/%'
        AND Symbol NOT LIKE '%#%'
        AND Symbol NOT LIKE '!%'
    ORDER BY LastUpdated ASC

CodePudding user response:

As far as I know, columns listed in the INCLUDE clause are not part of the actual B-tree index, but rather appear only in the leaf nodes. A consequence of this is that for your current index, the leaf nodes would generally not be sorted by the LastUpdated values. The values would be there in the leaf nodes, but there is no guarantee of any sort. Therefore, if you want to give your index a chance to cover all parts of your query, you should move LastUpdated into the actual index structure:

CREATE NONCLUSTERED INDEX [IXNI__symboltab__Status_active_Symbol]
    ON symboltab (Status, active, Symbol, LastUpdated);

CodePudding user response:

The best index for this query is actually

CREATE NONCLUSTERED INDEX [IXNI__symboltab__Status_active_Symbol]
    ON symboltab (Status, active, LastUpdated) INCLUDE (Symbol);

-- alternatively

CREATE NONCLUSTERED INDEX [IXNI__symboltab__Status_active_Symbol]
    ON symboltab (active, Status, LastUpdated) INCLUDE (Symbol);

The reason is that both active and Status have equality predicates, and can be seeked directly, therefore they should come first (in either order).

Symbol cannot be seeked, as it has multiple inequality predicates. Even if it had only one, it would still mess up the final sort. Therefore it must go in the INCLUDE, which is not part of the index key.

Finally LastUpdated, this means that the data is fully sorted, and does not need an extra sort.

You can see the difference in this db<>fiddle


Side notes:

  • If you get the indexing right, you do not need an index hint.
  • Do not use NOLOCK unless you really know what you're doing. It's not a go-faster switch, it's a give-incorrect-results switch.

CodePudding user response:

Select Top or Fetch First is actually a filter even if order by is absent, although there is usually order by clause. The index should be:

CREATE NONCLUSTERED INDEX [IXNI__symboltab__Status_active_Symbol] 
   ON symboltab (LastUpdated,Status,active,Symbol)

And you are done searching the rows with your filters as fast as possible. Note that if another similar query does not filter 'active' for example but does filter 'symbol' and the rest, only LastUpdated Status part of the index will be used. So if you have more queries, study a better order of columns into the index. Put column left-wise based on their usage, the more usage the most left-wise.

Included columns now are only used for reading, not for searching. If part of the index, no included is needed. Since you select symbol,LastUpdated that are both part of the index, no included columns are needed. If you add another column like 'FirstUpdated' tomorrow and you do not filter it (just for display purpose on select list) then you can add this particular column into INCLUDE to make your query faster because when the rows are found through the index included column will help read the information from the index itself. Else it will read the found rows to get that new column.

  • Related