Home > Enterprise >  Query with index incredibly slow on Sort
Query with index incredibly slow on Sort

Time:03-20

I have a database table with about 3.5 million rows. The table holds contract data records, with an amount, a date, and some IDs related to other tables (VendorId, AgencyId, StateId), this is the database table:

CREATE TABLE [dbo].[VendorContracts]
(
    [Id] [uniqueidentifier] NOT NULL,   
    [ContractDate] [datetime2](7) NOT NULL,
    [ContractAmount] [decimal](19, 4) NULL, 
    [VendorId] [uniqueidentifier] NOT NULL,
    [AgencyId] [uniqueidentifier] NOT NULL,
    [StateId] [uniqueidentifier] NOT NULL,

    [CreatedBy] [nvarchar](max) NULL,
    [CreatedDate] [datetime2](7) NOT NULL,
    [LastModifiedBy] [nvarchar](max) NULL,
    [LastModifiedDate] [datetime2](7) NULL,
    [IsActive] [bit] NOT NULL,

    CONSTRAINT [PK_VendorContracts] 
        PRIMARY KEY CLUSTERED ([Id] ASC)
                WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
                      OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

I have a page on my site where the user can filter a paged grid by VendorId and ContractDate, and sort by the ContractAmount or ContractDate. This is the query that EF Core produces when sorting by ContractAmount for this particular vendor that has over a million records:

DECLARE @__vendorId_0 uniqueIdentifier = 'f39c7198-b05a-477e-b7bc-cb189c5944c0';
DECLARE @__startDate_1 datetime2 = '2017-01-01T07:00:00.0000000';
DECLARE @__endDate_2 datetime2 = '2018-01-02T06:59:59.0000000';
DECLARE @__p_3 int = 0;
DECLARE @__p_4 int = 50;

SELECT [v].[Id], [v].[AdminFee], [v].[ContractAmount], [v].[ContractDate], [v].[PONumber], [v].[PostalCode], [v].[AgencyId], [v].[StateId], [v].[VendorId]
FROM [VendorContracts] AS [v]
WHERE (([v].[VendorId] = @__vendorId_0) AND ([v].[ContractDate] >= @__startDate_1)) AND ([v].[ContractDate] <= @__endDate_2)
ORDER BY [v].[ContractAmount] ASC
OFFSET @__p_3 ROWS FETCH NEXT @__p_4 ROWS ONLY

When I run this, it takes 50s, whether sorting ASC or DESC or offsetting by thousands, it's always 50s.

If I look at my Execution Plan, I see that it does use my index, but the Sort Cost is what's making the query take so long

enter image description here

This is my index:

CREATE NONCLUSTERED INDEX [IX_VendorContracts_VendorIdAndContractDate] ON [dbo].[VendorContracts]
(
    [VendorId] ASC,
    [ContractDate] DESC
)
INCLUDE([ContractAmount],[AdminFee],[PONumber],[PostalCode],[AgencyId],[StateId]) 
WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF)

The strange thing is that I have a similar index for sorting by ContractDate, and that one returns results in less than a second, even on the vendor that has millions of records.

Is there something wrong with my index? Or is sorting by a decimal data type just incredibly intensive?

CodePudding user response:

You have an index that allows the

VendorId = @__vendorId_0 and ContractDate BETWEEN @__startDate_1 AND @__endDate_2

predicate to be seeked exactly.

enter image description here

SQL Server estimates that 6,657 rows will match this predicate and need to be sorted so it requests a memory grant suitable for that amount of rows.

In reality for the parameter values where you see the problem nearly half a million are sorted and the memory grant is insufficient and the sort spills to disc.

50 seconds for 10,299 spilled pages does still sound unexpectedly slow but I assume you may well be on some very low SKU in Azure SQL Database?

Some possible solutions to resolve the issue might be to

  1. Force it to use an execution plan that is compiled for parameter values with your largest vendor and wide date range (e.g. with OPTIMIZE FOR hint). This will mean an excessive memory grant for smaller vendors though which may mean other queries have to incur memory grant waits.
  2. Use OPTION (RECOMPILE) so every invocation is recompiled for the specific parameter values passed. This means in theory every execution will get an appropriate memory grant at the cost of more time spent in compilation.
  3. Remove the need for a sort at all. If you have an index on VendorId, ContractAmount INCLUDE (ContractDate) then the VendorId = @__vendorId_0 part can be seeked and the index read in ContractAmount order. Once 50 rows have been found that match the ContractDate BETWEEN @__startDate_1 AND @__endDate_2 predicate then query execution can stop. SQL Server might not choose this execution plan without hints though.

I'm not sure how easy or otherwise it is to apply query hints through EF but you could look at forcing a plan via query store if you manage to get the desired plan to appear there.

  • Related