I have a table, communicats
, in my SQL Server database that contains data as below:
Id Current_Status Date Other_Columns
---------------------------------------------------
3aaab2ac | T | 2021-01-25 |
3aaab2ac | T | 2021-02-17 |
3aaab2ac | F | 2021-02-27 |
3aaab2ac | F | 2021-02-28 |
3aaab2ac | T | 2021-05-13 |
3aaab2ac | T | 2021-06-11 |
3aaab2ac | F | 2021-06-28 |
3aaab2ac | T | 2021-07-13 |
3aaab2ac | T | 2021-07-19 |
I want to format data like this:
Id Current_Status Date Row_ID Other_Columns
-------------------------------------------------------------
3aaab2ac | T | 2021-01-25 | 1 |
3aaab2ac | T | 2021-02-17 | 1 |
3aaab2ac | F | 2021-02-27 | 2 |
3aaab2ac | F | 2021-02-28 | 2 |
3aaab2ac | T | 2021-05-13 | 3 |
3aaab2ac | T | 2021-06-11 | 3 |
3aaab2ac | F | 2021-06-28 | 4 |
3aaab2ac | T | 2021-07-13 | 5 |
3aaab2ac | T | 2021-07-19 | 5 |
I tried something like that:
SELECT
*,
DENSE_RANK() OVER (ORDER BY Id, Current_status) as Row_Id
FROM
communicats
But result of that query is only:
Id Current_Status Date Row_ID Other_Columns
-------------------------------------------------------------
3aaab2ac | T | 2021-01-25 | 1 |
3aaab2ac | T | 2021-02-17 | 1 |
3aaab2ac | F | 2021-02-27 | 2 |
3aaab2ac | F | 2021-02-28 | 2 |
3aaab2ac | T | 2021-05-13 | 1 |
3aaab2ac | T | 2021-06-11 | 1 |
3aaab2ac | F | 2021-06-28 | 2 |
3aaab2ac | T | 2021-07-13 | 1 |
3aaab2ac | T | 2021-07-19 | 1 |
So I would like to give each group containing the same status an ascending Id number.
Necessarily, I need this to remain in a form that does not shorten the number of records.
How can I get this result?
CodePudding user response:
This is a type of gaps-and-islands problem and a common solution is:
Lag
the 'Current_Status' and detect when it changes.- Add 1 every time is changes and sum this value.
declare @communicats table (Id varchar(8), Current_Status varchar(1), [Date] date);
insert into @communicats (Id, Current_Status, [Date])
values
('3aaab2ac', 'T', '2021-01-25')
,('3aaab2ac', 'T', '2021-02-17')
,('3aaab2ac', 'F', '2021-02-27')
,('3aaab2ac', 'F', '2021-02-28')
,('3aaab2ac', 'T', '2021-05-13')
,('3aaab2ac', 'T', '2021-06-11')
,('3aaab2ac', 'F', '2021-06-28')
,('3aaab2ac', 'T', '2021-07-13')
,('3aaab2ac', 'T', '2021-07-19');
WITH cte AS (
SELECT *
, CASE WHEN LAG(Current_Status) OVER (ORDER BY [Date]) <> Current_Status THEN 1 ELSE 0 END LaggedStatus
FROM @communicats
)
SELECT id, Current_Status, [Date]
, 1 SUM(LaggedStatus) OVER (ORDER BY [Date]) Row_ID
FROM cte
ORDER BY [Date];
Returns:
id | Current_Status | Date | Row_ID |
---|---|---|---|
3aaab2ac | T | 2021-01-25 | 1 |
3aaab2ac | T | 2021-02-17 | 1 |
3aaab2ac | F | 2021-02-27 | 2 |
3aaab2ac | F | 2021-02-28 | 2 |
3aaab2ac | T | 2021-05-13 | 3 |
3aaab2ac | T | 2021-06-11 | 3 |
3aaab2ac | F | 2021-06-28 | 4 |
3aaab2ac | T | 2021-07-13 | 5 |
3aaab2ac | T | 2021-07-19 | 5 |
Note: If you provide your sample data as DDL DML (as shown here) you make it much easier for people to assist.
CodePudding user response:
Your target table des not make any sense. As far as I understood, you can try using window functions rank()
or dense_rank()
. But for using that you need to be clear on which column you are going to order the rank by.