I've got this table:
ID Date Event
----------------------------------------
123 2022-05-01 OCT
123 2022-05-04 OCT
123 2022-05-05 OCT
123 2022-05-07 OCT
123 2022-05-08 GRE
123 2022-05-10 GRE
123 2022-05-12 OCT
123 2022-05-15 OCT
What I wish is to count the number of events appearing in a row, in order to check if the event 'OCT' has happened 4 times in a row. for example:
Desirable output:
ID Date Event Order_Event
--------------------------------------------------------
123 2022-05-01 OCT 1
123 2022-05-04 OCT 2
123 2022-05-05 OCT 3
123 2022-05-07 OCT 4
123 2022-05-08 GRE 1
123 2022-05-10 GRE 2
123 2022-05-12 OCT 1
123 2022-05-15 OCT 2
What I have tried is the row number() but it gives me that:
ID Date Event Order_Event
--------------------------------------------------------
123 2022-05-01 OCT 1
123 2022-05-04 OCT 2
123 2022-05-05 OCT 3
123 2022-05-07 OCT 4
123 2022-05-08 GRE 1
123 2022-05-10 GRE 2
123 2022-05-12 OCT 5
123 2022-05-15 OCT 6
you see? I don't want it to count the events 'OCT' in the 12th and 15th of may as five and six.. i want to see only if each event happened 4 times in a row (by date)
Thank you!
CodePudding user response:
Try this:
DECLARE @DataSource TABLE
(
[ID] INT
,[Date] DATE
,[Event] VARCHAR(3)
);
INSERT INTO @DataSource ([ID], [Date], [Event])
VALUES (123, '2022-05-01', 'OCT')
,(123, '2022-05-04', 'OCT')
,(123, '2022-05-05', 'OCT')
,(123, '2022-05-07', 'OCT')
,(123, '2022-05-08', 'GRE')
,(123, '2022-05-10', 'GRE')
,(123, '2022-05-12', 'OCT')
,(123, '2022-05-15', 'OCT');
SELECT [ID], [Date], [Event]
,ROW_NUMBER() OVER (PARTITION BY grp ORDER BY [Date]) AS [ Order_Event]
FROM
(
SELECT *
,ROW_NUMBER() OVER(ORDER BY [Date]) - ROW_NUMBER() OVER(ORDER BY [Event], [Date]) AS grp
FROM @DataSource
) DS
ORDER BY [Date];
CodePudding user response:
One option here is to run two row numbers:
WITH sub AS
(SELECT id, date, event,
ROW_NUMBER() OVER(PARTITION BY id ORDER BY date)
AS RowNr1,
ROW_NUMBER() OVER (PARTITION BY id, event ORDER BY date) AS RowNr2
FROM yourtable)
SELECT id, date, event,
ROW_NUMBER() OVER (PARTITION BY id, RowNr1 - RowNr2 ORDER BY date) AS Order_Event
FROM sub
ORDER BY
id,
date;
The idea is to build one row number per id and one per id and event, both sorted by date. Then we do the partition by the difference of those row numbers. It's difficult to explain in words only, better see the fiddle here:
There you can replicate the result of the subquery with the two different row numbers and replicate how it works.
Side note: I recommend to rename the column "date" because this is a SQL key word and also not really meaningful. Better use something like "eventDate".
CodePudding user response:
You can define groups for your events by implementing a count on when the data changes using lag, and then number these individual groups sequentially:
with groups as (
select *,
Sum(case when Event = prevEvent then 0 else 1 end) over(partition by Id order by date) gp
from (
select *, Lag(Event) over(partition by Id order by date) prevEvent
from t
)t
)
select Id, Date, Event,
Row_Number() over(partition by Id, gp order by date) Order_Event
from groups
order by Id, Date;
Demo Fiddle