Home > Software engineering >  SQL Server - Combing two selects into one for particular logic?
SQL Server - Combing two selects into one for particular logic?

Time:07-13

I have the following SQL I would like to see if I can make one select instead of two? the first check gets the snapshot id then the second one uses this to only return the latest rows ignoring any rows before the snapshot. I know that we shouldn't be ordering by id(unpredictable) but that is the current code!

The code looks like this

DECLARE @UserId INT = 75;
DECLARE @lastSnapshotId INT;

SET @lastSnapshotId = (
    SELECT TOP 1 Id
    FROM #MyDataTable
    WHERE UserId = @UserId
    And IsSnapshot = 1
    ORDER BY Id DESC
);

SELECT Content 
FROM #MyDataTable
WHERE UserId = @UserId
AND (@lastSnapshotId IS NULL OR Id >= @lastSnapshotId)
ORDER BY Id ASC;

I'm not sure if its possible but could there be a performance increase if I did one call instead of two, not sure how to do this. I was thinking that a ROW_NUMBER() could work but then stumped.

Here is the execution plan.

enter image description here

I've tried to put together an example here:

DROP TABLE IF EXISTS #Nums

CREATE TABLE #Nums(n INT NOT NULL PRIMARY KEY)

INSERT INTO #Nums(n)
VALUES(0), (1), (2), (3), (4), (5), (6), (7), (8), (9)


DROP TABLE IF EXISTS #MyDataTable;

CREATE TABLE #MyDataTable (
    Id INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_MyDataTable PRIMARY KEY,
    UserId INT, 
    IsSnapshot BIT, 
    Content NVARCHAR(500)
);

INSERT INTO #MyDataTable(UserId, IsSnapshot, Content)
SELECT ROUND(RAND(CHECKSUM(NEWID())) * (100), 0),
ROUND(RAND(CHECKSUM(NEWID())) * (1), 0),
'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.'
FROM #Nums AS N1
CROSS JOIN #Nums AS N2
CROSS JOIN #Nums AS N3
CROSS JOIN #Nums AS N4

DROP INDEX IF EXISTS IX_MyDataTable_UserId ON #MyDataTable;

CREATE NONCLUSTERED INDEX [IX_MyDataTable_UserId] 
ON #MyDataTable ([UserId]);

--SELECT * FROM #MyDataTable

/*
Find the user with the most entried
-----------------------------------------------------------------
*/
DECLARE @UserWithMostEntries INT;

WITH UsersWithMostEntries AS (
    SELECT UserId, COUNT(*) AS [Count]
    FROM #MyDataTable
    GROUP BY UserId
    HAVING COUNT(*) > 1
)
SELECT TOP 1 @UserWithMostEntries = UserId FROM UsersWithMostEntries ORDER BY [Count] DESC

--SELECT @UserWithMostEntries

/*
The actual SQL I want to see if it can be improved, ideally with just one call not 2?
-----------------------------------------------------------------
*/
DECLARE @UserId INT = @UserWithMostEntries; -- Find one that has a snapshot
DECLARE @lastSnapshotId INT;

SET @lastSnapshotId = (
    SELECT TOP 1 Id
    FROM #MyDataTable
    WHERE UserId = @UserId
    And IsSnapshot = 1
    ORDER BY Id DESC
);

--SELECT @lastSnapshotId;

SELECT Content 
FROM #MyDataTable
WHERE UserId = @UserId
AND (@lastSnapshotId IS NULL OR Id >= @lastSnapshotId)
ORDER BY Id ASC;

CodePudding user response:

This query does what you need. The possible absence of a snapshot for the user is taken into account.

WITH lastSnapshot AS (
    SELECT MAX(Id) AS Id
    FROM #MyDataTable 
    WHERE UserId = 75 AND IsSnapshot = 1
    GROUP BY UserId
)
SELECT Content 
FROM #MyDataTable data
WHERE data.UserId = 75 
      AND 
      (
          NOT EXISTS (SELECT 1 FROM lastSnapshot) 
          OR 
          data.Id >= (SELECT Id FROM lastSnapshot)
      )
ORDER BY data.Id ASC

CodePudding user response:

As already suggested, a CTE (or equivalent derived table) can be used in most situations to convert two dependent queries into one. But I think the below approach more directly expresses your goal. You simply use a subquery to get the max ("top 1 desc" in your post - but I think it easier to understand using max) value for comparison purposes.

DECLARE @UserId INT = 75;

SELECT Content 
  FROM #MyDataTable
 WHERE UserId = @UserId
   AND Id = (SELECT MAX(Id) from #MyDataTable
              WHERE UserId = @UserId And IsSnapshot = 1)
 ORDER BY Id ASC;

Now you MIGHT need to allow for a situation where there is no matching row in the subquery. I am a bit suspicious of that idea but I'll leave that to you.

  • Related