Home > Back-end >  Return only the modified records in a table, using T-SQL
Return only the modified records in a table, using T-SQL

Time:05-23

I'm attempting to return only the modified records in a table, using T-SQL.

This is what I've done thus far:

 BEGIN
    IF OBJECT_ID('tempdb..#Test') IS NOT NULL
        DROP TABLE #Test

    CREATE TABLE #Test
        (SetName nvarchar(100),
        [Timestamp] datetime,
        Value smallint)

    INSERT INTO #Test VALUES('Alpha', GETDATE(), 1)
    INSERT INTO #Test VALUES('Alpha', GETDATE(), 0)
    INSERT INTO #Test VALUES('Alpha', GETDATE(), 1)
    INSERT INTO #Test VALUES('Beta', GETDATE(), 1)
    INSERT INTO #Test VALUES('Beta', GETDATE(), 1)
    INSERT INTO #Test VALUES('Beta', GETDATE(), 1)
    INSERT INTO #Test VALUES('Gamma', GETDATE(), 1)
    INSERT INTO #Test VALUES('Gamma', GETDATE(), 0)
    INSERT INTO #Test VALUES('Gamma', GETDATE(), 1)

    SELECT * FROM #Test

END;

Results:

SetName Timestamp               Value

Alpha   2022-05-23 12:58:41.100 1
Alpha   2022-05-23 12:58:41.101 0
Alpha   2022-05-23 12:58:41.102 1
Beta    2022-05-23 12:58:41.103 1
Beta    2022-05-23 12:58:41.104 1
Beta    2022-05-23 12:58:41.105 1
Gamma   2022-05-23 12:58:41.106 1
Gamma   2022-05-23 12:58:41.107 0
Gamma   2022-05-23 12:58:41.108 1

Expected results:

Alpha   2022-05-23 12:58:41.101 0 -- Changed from 1 to 0
Alpha   2022-05-23 12:58:41.102 1 -- Changed from 0 to 1
Gamma   2022-05-23 12:58:41.107 0 -- Changed from 1 to 0
Gamma   2022-05-23 12:58:41.108 1 -- Changed from 0 to 1

The following statement returns all the 0 to 1, and 1 to 0 records and I don't understand why:

;WITH cte AS
(
SELECT 
    SetName, [Timestamp], Value, lag(Value, 1, -1) OVER (ORDER BY [Timestamp]) AS LastValue
FROM #Test
) 

SELECT 
    SetName, [Timestamp], [Value]
FROM
    cte
WHERE value <> LastValue

CodePudding user response:

This solution assumes that 2 rows for a single SetName cannot have the same value for Timestamp (2 rows for different values of SetName is fine).

One method is to use gaps and islands to put the values into groups, and then omit group 0 from the data:

WITH CTE AS(
    SELECT SetName,
           [Timestamp], --timestamp is a deprecrated synonym for rowversion; I don't recommend it's use as a column name
           [value],
           ROW_NUMBER() OVER (PARTITION BY SetName ORDER BY Timestamp) - 
           ROW_NUMBER() OVER (PARTITION BY SetName, [Value] ORDER BY Timestamp) AS Grp
    FROM #Test)
SELECT SetName,
       [Timestamp], --timestamp is a deprecrated synonym for rowversion; I don't recommend it's use as a column name
       [value]
FROM CTE
WHERE Grp != 0
ORDER BY [Timestamp];

db<>fiddle

If 2 rows can have the same value, then hopefully you have an always ascending value you can use in the ORDER BY as well; otherwise you have no way of knowing what the "first" row is.

CodePudding user response:

To track the changes at whole dataset, you were almost there, you just need to remove the first row by using LastValue <> -1.

WITH CTE AS
(
SELECT 
    SetName, [Timestamp], Value, lag(Value, 1, -1) OVER (ORDER BY [Timestamp]) AS LastValue
FROM #Test
) 

SELECT 
    SetName, [Timestamp], [Value]
FROM
    CTE
WHERE value<>Lastvalue
AND Lastvalue<> -1; --Add this filter to remove the rows which doesn't have any Lag Value

To track the changes at SetName level.

WITH CTE AS
(
SELECT 
   SetName, [Timestamp], Value, Lag(Value, 1, -1) OVER (PARTITION BY SetName ORDER BY [Timestamp]) AS LastValue
FROM #Test
) 

SELECT 
    SetName, [Timestamp], [Value]
FROM
    CTE
WHERE Value<>LastValue
AND LastValue <> -1;

db<>fiddle: Try here

Note: From the given dataset, modified records at SetName level and at whole dataset is same.

Assumption: Each Timestamp value is different.

  • Related