Home > OS >  Date comparison slow for large number of joined rows
Date comparison slow for large number of joined rows

Time:12-22

The following query groups Snippets by ChannelId and returns an UnreadSnippetCount.

To determine the UnreadSnippetCount, Channel is joined onto ChannelUsers to fetch the date that the User last read the Channel and it uses this LastReadDate to limit the count to rows where the snippet was created after the user last read the channel.

    SELECT c.Id, COUNT(s.Id) as [UnreadSnippetCount] 
    FROM Channels c
    INNER JOIN ChannelUsers cu
        ON cu.ChannelId = c.Id
    LEFT JOIN Snippets s
        ON cu.ChannelId = s.ChannelId
        AND s.CreatedByUserId <> @UserId
    WHERE cu.UserId = @UserId
    AND (cu.LastReadDate IS NULL OR s.CreatedDate > cu.LastReadDate)
    AND c.Id IN (select value from STRING_SPLIT(@ChannelIds, ','))

    GROUP BY c.Id

The query works well logically but for Channels that have a large number of Snippets (97691), the query can take 10 minutes or more to return.

The following index is created:

CREATE NONCLUSTERED INDEX [IX_Snippets_CreatedDate] ON [dbo].[Snippets]
(
    [CreatedDate] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO

Update:

Query execution plan (original query):

https://www.brentozar.com/pastetheplan/?id=B19sI105F

Update 2

Moving the where clause into the join as suggested:

    SELECT c.Id, COUNT(s.Id) as [UnreadSnippetCount] 
    FROM Channels c
    INNER JOIN ChannelUsers cu
        ON cu.ChannelId = c.Id
    LEFT JOIN Snippets s
        ON cu.ChannelId = s.ChannelId
        AND s.CreatedByUserId <> @UserId
        AND s.CreatedDate > cu.LastReadDate
    WHERE cu.UserId = @UserId
    AND c.Id IN (select value from STRING_SPLIT(@ChannelIds, ',')

Produces this execution plan:

https://www.brentozar.com/pastetheplan/?id=HkqwFk0ct

Is there a better date comparison method I can use?

Update 3 - Solution

Index


CREATE NONCLUSTERED INDEX [IX_Snippet_Created] ON [dbo].[Snippets]
  (ChannelId ASC, CreatedDate ASC) INCLUDE (CreatedByUserId);

Stored Proc

ALTER PROCEDURE [dbo].[GetUnreadSnippetCounts2]
(
    @ChannelIds ChannelIdsType READONLY,
    @UserId nvarchar(36)
)
AS
    SET NOCOUNT ON

SELECT
    c.Id,
    COUNT(s.Id) as [UnreadSnippetCount] 
FROM Channels c
JOIN @ChannelIds cid
    ON cid.Id = c.Id
INNER JOIN ChannelUsers cu
    ON cu.ChannelId = c.Id
    AND cu.UserId = @UserId
JOIN Snippets s
    ON cu.ChannelId = s.ChannelId
    AND s.CreatedByUserId <> @UserId
    AND (cu.LastReadDate IS NULL OR s.CreatedDate > cu.LastReadDate)
GROUP BY c.Id;

This gives the correct results logically and returns quickly.

Resulting execution plan:

https://www.brentozar.com/pastetheplan/?id=S1GwRCCcK

CodePudding user response:

There are a number of inefficiencies I can see in the query plan.

  1. Using STRING_SPLIT means the compiler does not know how many values are being returned, or that they are unique, and the data type is mismatched. Ideally you would pass in a Table valued Parameter, however if you cannot do so then another solution is to dump them into a table variable

    DECLARE @tmp TABLE (Id int PRIMARY KEY);
    INSERT @tmp (Id)
    select value
    from STRING_SPLIT(@ChannelIds, ',')
    
  2. You need better indexing on Snippets. I would suggest the following

    CREATE NONCLUSTERED INDEX [IX_Snippet_Created] ON [dbo].[Snippets]
      (ChannelId ASC, CreatedDate ASC) INCLUDE (CreatedByUserId);
    

    It doesn't make sense to place CreatedByUserId in the key, because it's an inequality. Keep it in the INCLUDE

  3. As you have already been told, it's better if you move the conditions (for left-joined tables) to the ON clause. I don't know if you then still need the cu.LastReadDate IS NULL check, I've left it in.

  4. I must say, I'm unclear your schema, but INNER JOIN ChannelUsers cu feels wrong here, perhaps it should be a LEFT JOIN? I cannot say further without seeing your full setup and required output.

SELECT
  c.Id,
  COUNT(s.Id) as [UnreadSnippetCount] 
FROM Channels c
JOIN @tmp t
    ON t.Id = c.Id
INNER JOIN ChannelUsers cu
    ON cu.ChannelId = c.Id
    AND cu.UserId = @UserId
LEFT JOIN Snippets s
    ON cu.ChannelId = s.ChannelId
    AND s.CreatedByUserId <> @UserId
    AND (cu.LastReadDate IS NULL OR s.CreatedDate > cu.LastReadDate)
GROUP BY c.Id;
  • Related