The following query groups Snippets
by ChannelId
and returns an UnreadSnippetCount
.
To determine the UnreadSnippetCount
, Channel
is joined onto ChannelUsers
to fetch the date that the User
last read the Channel
and it uses this LastReadDate
to limit the count to rows where the snippet was created after the user last read the channel.
SELECT c.Id, COUNT(s.Id) as [UnreadSnippetCount]
FROM Channels c
INNER JOIN ChannelUsers cu
ON cu.ChannelId = c.Id
LEFT JOIN Snippets s
ON cu.ChannelId = s.ChannelId
AND s.CreatedByUserId <> @UserId
WHERE cu.UserId = @UserId
AND (cu.LastReadDate IS NULL OR s.CreatedDate > cu.LastReadDate)
AND c.Id IN (select value from STRING_SPLIT(@ChannelIds, ','))
GROUP BY c.Id
The query works well logically but for Channels
that have a large number of Snippets
(97691), the query can take 10 minutes or more to return.
The following index is created:
CREATE NONCLUSTERED INDEX [IX_Snippets_CreatedDate] ON [dbo].[Snippets]
(
[CreatedDate] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
Update:
Query execution plan (original query):
https://www.brentozar.com/pastetheplan/?id=B19sI105F
Update 2
Moving the where
clause into the join
as suggested:
SELECT c.Id, COUNT(s.Id) as [UnreadSnippetCount]
FROM Channels c
INNER JOIN ChannelUsers cu
ON cu.ChannelId = c.Id
LEFT JOIN Snippets s
ON cu.ChannelId = s.ChannelId
AND s.CreatedByUserId <> @UserId
AND s.CreatedDate > cu.LastReadDate
WHERE cu.UserId = @UserId
AND c.Id IN (select value from STRING_SPLIT(@ChannelIds, ',')
Produces this execution plan:
https://www.brentozar.com/pastetheplan/?id=HkqwFk0ct
Is there a better date comparison method I can use?
Update 3 - Solution
Index
CREATE NONCLUSTERED INDEX [IX_Snippet_Created] ON [dbo].[Snippets]
(ChannelId ASC, CreatedDate ASC) INCLUDE (CreatedByUserId);
Stored Proc
ALTER PROCEDURE [dbo].[GetUnreadSnippetCounts2]
(
@ChannelIds ChannelIdsType READONLY,
@UserId nvarchar(36)
)
AS
SET NOCOUNT ON
SELECT
c.Id,
COUNT(s.Id) as [UnreadSnippetCount]
FROM Channels c
JOIN @ChannelIds cid
ON cid.Id = c.Id
INNER JOIN ChannelUsers cu
ON cu.ChannelId = c.Id
AND cu.UserId = @UserId
JOIN Snippets s
ON cu.ChannelId = s.ChannelId
AND s.CreatedByUserId <> @UserId
AND (cu.LastReadDate IS NULL OR s.CreatedDate > cu.LastReadDate)
GROUP BY c.Id;
This gives the correct results logically and returns quickly.
Resulting execution plan:
https://www.brentozar.com/pastetheplan/?id=S1GwRCCcK
CodePudding user response:
There are a number of inefficiencies I can see in the query plan.
Using
STRING_SPLIT
means the compiler does not know how many values are being returned, or that they are unique, and the data type is mismatched. Ideally you would pass in a Table valued Parameter, however if you cannot do so then another solution is to dump them into a table variableDECLARE @tmp TABLE (Id int PRIMARY KEY); INSERT @tmp (Id) select value from STRING_SPLIT(@ChannelIds, ',')
You need better indexing on
Snippets
. I would suggest the followingCREATE NONCLUSTERED INDEX [IX_Snippet_Created] ON [dbo].[Snippets] (ChannelId ASC, CreatedDate ASC) INCLUDE (CreatedByUserId);
It doesn't make sense to place
CreatedByUserId
in the key, because it's an inequality. Keep it in theINCLUDE
As you have already been told, it's better if you move the conditions (for left-joined tables) to the
ON
clause. I don't know if you then still need thecu.LastReadDate IS NULL
check, I've left it in.I must say, I'm unclear your schema, but
INNER JOIN ChannelUsers cu
feels wrong here, perhaps it should be aLEFT JOIN
? I cannot say further without seeing your full setup and required output.
SELECT
c.Id,
COUNT(s.Id) as [UnreadSnippetCount]
FROM Channels c
JOIN @tmp t
ON t.Id = c.Id
INNER JOIN ChannelUsers cu
ON cu.ChannelId = c.Id
AND cu.UserId = @UserId
LEFT JOIN Snippets s
ON cu.ChannelId = s.ChannelId
AND s.CreatedByUserId <> @UserId
AND (cu.LastReadDate IS NULL OR s.CreatedDate > cu.LastReadDate)
GROUP BY c.Id;