I have a problem and I am new to SQL so I am not sure how to solve it. I have a table with two columns and I want to start at the earliest date and take the next occurrence (of the name column) as the end date.
I know I want to group by Name, but I am not sure what to do after that:
Name Date
x Jan-01
y Feb-01
z Mar-01
x Jan-02
y Feb-02
z Mar-02
x Jan-03
y Feb-03
z Mar-03
x Jan-04
y Feb-04
z Mar-04
x Jan-05
y Feb-05
z Mar-05
I want the resulting table to look like this:
Name DateStart DateEnd
x Jan-01 Jan-02
y Feb-01 Feb-02
z Mar-01 Mar-02
x Jan-03 Jan-04
y Feb-03 Feb-04
z Mar-03 Mar-04
x Jan-05 NULL
y Feb-05 NULL
z Mar-05 NULL
CodePudding user response:
Assuming your date column be a proper sortable date and not text, then we can approach this as a gaps and islands problem. We can use the difference in row numbers method.
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY Date) rn1,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date) rn2
FROM yourTable
)
SELECT Name, MIN(Date) AS DateStart, MAX(Date) AS DateEnd
FROM cte
GROUP BY Name, rn1 - rn2
ORDER BY MIN(Date);
CodePudding user response:
If I understand the question correctly and you want to make pairs of dates for each Name
, the following statement is an option:
Test data:
SELECT *
INTO Data
FROM (VALUES
('x', CONVERT(date, 'Jan 01 2022', 106)),
('y', CONVERT(date, 'Feb 01 2022', 106)),
('z', CONVERT(date, 'Mar 01 2022', 106)),
('x', CONVERT(date, 'Jan 02 2022', 106)),
('y', CONVERT(date, 'Feb 02 2022', 106)),
('z', CONVERT(date, 'Mar 02 2022', 106)),
('x', CONVERT(date, 'Jan 03 2022', 106)),
('y', CONVERT(date, 'Feb 03 2022', 106)),
('z', CONVERT(date, 'Mar 03 2022', 106)),
('x', CONVERT(date, 'Jan 04 2022', 106)),
('y', CONVERT(date, 'Feb 04 2022', 106)),
('z', CONVERT(date, 'Mar 04 2022', 106)),
('x', CONVERT(date, 'Jan 05 2022', 106)),
('y', CONVERT(date, 'Feb 05 2022', 106)),
('z', CONVERT(date, 'Mar 05 2022', 106))
) v (Name, Date)
Statement:
SELECT
Name,
StartDate = MIN(CASE WHEN (Rn - 1) % 2 = 0 THEN Date END),
EndDate = MAX(CASE WHEN (Rn - 1) % 2 = 1 THEN Date END)
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date) AS Rn
FROM Data
) t
GROUP BY Name, ((Rn - 1) / 2)
ORDER BY Name
Result:
Name | StartDate | EndDate |
---|---|---|
x | 2022-01-01 | 2022-01-02 |
x | 2022-01-03 | 2022-01-04 |
x | 2022-01-05 | null |
y | 2022-02-01 | 2022-02-02 |
y | 2022-02-03 | 2022-02-04 |
y | 2022-02-05 | null |
z | 2022-03-01 | 2022-03-02 |
z | 2022-03-03 | 2022-03-04 |
z | 2022-03-05 | null |