I basically want to group the set by timeline in SQL, I am so out of ideas right now obviously group by does not work and so does row number.
Any ideas for SQL are really appreciated.
select shift_date,associate_id,name,description , min(START_TRAN_DATE) as startdate, max(end_tran_date) as end_date
from ltu_vt
group by shift_date,associate_id,name,description
**SHIFT_DATE ID NAME DESC START_TRAN_DATE END_TRAN_DATE**
2022-11-13 42 John Doe ADP 2022-11-13 06:31:00.000 2022-11-13 06:31:22.000
2022-11-13 42 John Doe LINE 2022-11-13 06:31:22.000 2022-11-13 06:50:13.000
2022-11-13 42 John Doe HJ 2022-11-13 06:50:13.000 2022-11-13 06:50:13.000
2022-11-13 42 John Doe HJ 2022-11-13 06:52:13.000 2022-11-13 06:52:13.000
2022-11-13 42 John Doe HJ 2022-11-13 06:52:20.000 2022-11-13 06:52:20.000
2022-11-13 42 John Doe HJ 2022-11-13 06:52:25.000 2022-11-13 06:52:25.000
2022-11-13 42 John Doe HJ 2022-11-13 06:52:46.000 2022-11-13 06:52:46.000
2022-11-13 42 John Doe BG 2022-11-13 06:53:58.000 2022-11-13 06:53:58.000
2022-11-13 42 John Doe BG 2022-11-13 06:54:01.000 2022-11-13 06:54:01.000
2022-11-13 42 John Doe HJ 2022-11-13 07:13:49.000 2022-11-13 07:13:49.000
2022-11-13 42 John Doe P2L 2022-11-13 07:14:09.000 2022-11-13 07:14:09.000
2022-11-13 42 John Doe P2L 2022-11-13 07:19:48.000 2022-11-13 07:19:48.000
2022-11-13 42 John Doe ADP 2022-11-13 07:20:00.000 2022-11-13 07:20:00.000
expected output is
**SHIFT_DATE ID NAME DESC START_TRAN_DATE END_TRAN_DATE**
2022-11-13 42 John Doe ADP 2022-11-13 06:31:00.000 2022-11-13 06:31:22.000
2022-11-13 42 John Doe LINE 2022-11-13 06:31:22.000 2022-11-13 06:50:13.000
2022-11-13 42 John Doe HJ 2022-11-13 06:50:13.000 2022-11-13 06:52:46.000
2022-11-13 42 John Doe BG 2022-11-13 06:53:58.000 2022-11-13 06:54:01.000
2022-11-13 42 John Doe HJ 2022-11-13 07:13:49.000 2022-11-13 07:13:49.000
2022-11-13 42 John Doe P2L 2022-11-13 07:14:09.000 2022-11-13 07:19:48.000
2022-11-13 42 John Doe ADP 2022-11-13 07:20:00.000 2022-11-13 07:20:00.000
CodePudding user response:
From your expected output it seems that you want to group by description
, so this is how you could do it:
SELECT SHIFT_DATE,
associate_id ID,
Name
description DESC,
MIN(START_TRAN_DATE),
MAX(END_TRAN_DATE)
FROM ltu_vt
GROUP BY SHIFT_DATE, ASSOCIATE_ID, DESCRIPTION
Note that the only information I have is the info you have given, so, if I misunderstood the problem, provide more information, maybe a SQL fiddle as well.
CodePudding user response:
Please try the following solution.
It is a well known "gaps and islands" problem.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (SHIFT_DATE DATE, ID INT, [NAME] VARCHAR(20), [DESC] varchar(10), START_TRAN_DATE datetime, END_TRAN_DATE DATETIME);
INSERT @tbl (SHIFT_DATE, ID, NAME, [DESC], START_TRAN_DATE, END_TRAN_DATE) VALUES
('2022-11-13', 42, 'John Doe', 'ADP', '2022-11-13 06:31:00.000', '2022-11-13 06:31:22.000'),
('2022-11-13', 42, 'John Doe', 'LINE','2022-11-13 06:31:22.000', '2022-11-13 06:50:13.000'),
('2022-11-13', 42, 'John Doe', 'HJ', '2022-11-13 06:50:13.000', '2022-11-13 06:50:13.000'),
('2022-11-13', 42, 'John Doe', 'HJ', '2022-11-13 06:52:13.000', '2022-11-13 06:52:13.000'),
('2022-11-13', 42, 'John Doe', 'HJ', '2022-11-13 06:52:20.000', '2022-11-13 06:52:20.000'),
('2022-11-13', 42, 'John Doe', 'HJ', '2022-11-13 06:52:25.000', '2022-11-13 06:52:25.000'),
('2022-11-13', 42, 'John Doe', 'HJ', '2022-11-13 06:52:46.000', '2022-11-13 06:52:46.000'),
('2022-11-13', 42, 'John Doe', 'BG', '2022-11-13 06:53:58.000', '2022-11-13 06:53:58.000'),
('2022-11-13', 42, 'John Doe', 'BG', '2022-11-13 06:54:01.000', '2022-11-13 06:54:01.000'),
('2022-11-13', 42, 'John Doe', 'HJ', '2022-11-13 07:13:49.000', '2022-11-13 07:13:49.000'),
('2022-11-13', 42, 'John Doe', 'P2L', '2022-11-13 07:14:09.000', '2022-11-13 07:14:09.000'),
('2022-11-13', 42, 'John Doe', 'P2L', '2022-11-13 07:19:48.000', '2022-11-13 07:19:48.000'),
('2022-11-13', 42, 'John Doe', 'ADP', '2022-11-13 07:20:00.000', '2022-11-13 07:20:00.000');
-- DDL and sample data population, end
SELECT series, SHIFT_DATE, ID, NAME, [DESC], MIN(START_TRAN_DATE) AS Date_Start, MAX(END_TRAN_DATE) AS Date_End, COUNT(SHIFT_DATE) AS Shift_Counter
FROM
(
SELECT *,
SUM(IIF([DESC] <> ns, 1, 0)) OVER (ORDER BY START_TRAN_DATE) AS series
FROM
(
SELECT series.*,
LAG([DESC]) OVER (ORDER BY START_TRAN_DATE) AS ns
FROM @tbl AS series
) q
) q
GROUP BY series, SHIFT_DATE, ID, NAME, [DESC]
ORDER BY series;
Output
series | SHIFT_DATE | ID | NAME | DESC | Date_Start | Date_End | Shift_Counter |
---|---|---|---|---|---|---|---|
0 | 2022-11-13 | 42 | John Doe | ADP | 2022-11-13 06:31:00.000 | 2022-11-13 06:31:22.000 | 1 |
1 | 2022-11-13 | 42 | John Doe | LINE | 2022-11-13 06:31:22.000 | 2022-11-13 06:50:13.000 | 1 |
2 | 2022-11-13 | 42 | John Doe | HJ | 2022-11-13 06:50:13.000 | 2022-11-13 06:52:46.000 | 5 |
3 | 2022-11-13 | 42 | John Doe | BG | 2022-11-13 06:53:58.000 | 2022-11-13 06:54:01.000 | 2 |
4 | 2022-11-13 | 42 | John Doe | HJ | 2022-11-13 07:13:49.000 | 2022-11-13 07:13:49.000 | 1 |
5 | 2022-11-13 | 42 | John Doe | P2L | 2022-11-13 07:14:09.000 | 2022-11-13 07:19:48.000 | 2 |
6 | 2022-11-13 | 42 | John Doe | ADP | 2022-11-13 07:20:00.000 | 2022-11-13 07:20:00.000 | 1 |