My source events table looks like this:
The result set I need should not include the duplicated "OPEN" values (4, 5), and if possible, return the first occurrence only.
The closest I've been able to get is to group them by looking at answers discussing "gaps and islands" but I'm lost as to how to actually "eliminate" the unneeded rows.
CodePudding user response:
Rather than using gaps and islands, I'd suggest using a simple LAG
to get the previous row's OpenOrClosed
value - then check if that previous row's OpenOrClosed
value is different from the current row's.
This involves a first SELECT to get all rows, with LAG to get the corresponding previous row's value - and then a second SELECT to filter out only those that are different.
See this db<>fiddle which was based on your original one.
SELECT [ID]
,[EventDateTime]
,[SourceEventID]
,[OpenOrClosed]
FROM
(SELECT [ID]
,[EventDateTime]
,[SourceEventID]
,[OpenOrClosed]
,LAG([OpenOrClosed], 1) OVER (ORDER BY ID) AS [LastOOC]
FROM [dbo].[test]
) AS SourceData
WHERE LastOOC IS NULL OR OpenOrClosed <> LastOOC
Results are as below
ID EventDateTime SourceEventID OpenOrClosed
1 2013-01-31 14:20:00.000 331832 OPEN
2 2013-04-18 14:44:00.000 338907 CLOSED
3 2013-04-19 15:30:00.000 341210 OPEN
6 2013-04-24 12:22:00.000 339250 CLOSED
7 2013-05-08 12:32:00.000 340281 OPEN
8 2013-05-08 12:33:00.000 340282 CLOSED
9 2013-05-08 12:34:00.000 340255 OPEN