Home > Net >  SQL - Aggregate on multiple columns
SQL - Aggregate on multiple columns

Time:11-26

I have a problem and I am new to SQL so I am not sure how to solve it. I have a table with two columns and I want to start at the earliest date and take the next occurrence (of the name column) as the end date.

I know I want to group by Name, but I am not sure what to do after that:

 Name     Date
  x      Jan-01    
  y      Feb-01
  z      Mar-01
  x      Jan-02
  y      Feb-02
  z      Mar-02
  x      Jan-03
  y      Feb-03
  z      Mar-03
  x      Jan-04
  y      Feb-04
  z      Mar-04
  x      Jan-05
  y      Feb-05
  z      Mar-05

I want the resulting table to look like this:

 Name DateStart DateEnd
  x    Jan-01    Jan-02
  y    Feb-01    Feb-02
  z    Mar-01    Mar-02
  x    Jan-03    Jan-04
  y    Feb-03     Feb-04
  z    Mar-03     Mar-04
  x    Jan-05     NULL
  y    Feb-05     NULL       
  z    Mar-05     NULL

CodePudding user response:

Assuming your date column be a proper sortable date and not text, then we can approach this as a gaps and islands problem. We can use the difference in row numbers method.

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Date) rn1,
              ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date) rn2
    FROM yourTable
)

SELECT Name, MIN(Date) AS DateStart, MAX(Date) AS DateEnd
FROM cte
GROUP BY Name, rn1 - rn2
ORDER BY MIN(Date);

CodePudding user response:

If I understand the question correctly and you want to make pairs of dates for each Name, the following statement is an option:

Test data:

SELECT *
INTO Data
FROM (VALUES
   ('x', CONVERT(date, 'Jan 01 2022', 106)),    
   ('y', CONVERT(date, 'Feb 01 2022', 106)),
   ('z', CONVERT(date, 'Mar 01 2022', 106)),
   ('x', CONVERT(date, 'Jan 02 2022', 106)),
   ('y', CONVERT(date, 'Feb 02 2022', 106)),
   ('z', CONVERT(date, 'Mar 02 2022', 106)),
   ('x', CONVERT(date, 'Jan 03 2022', 106)),
   ('y', CONVERT(date, 'Feb 03 2022', 106)),
   ('z', CONVERT(date, 'Mar 03 2022', 106)),
   ('x', CONVERT(date, 'Jan 04 2022', 106)),
   ('y', CONVERT(date, 'Feb 04 2022', 106)),
   ('z', CONVERT(date, 'Mar 04 2022', 106)),
   ('x', CONVERT(date, 'Jan 05 2022', 106)),
   ('y', CONVERT(date, 'Feb 05 2022', 106)),
   ('z', CONVERT(date, 'Mar 05 2022', 106))
) v (Name, Date)

Statement:

SELECT 
   Name, 
   StartDate = MIN(CASE WHEN (Rn - 1) % 2 = 0 THEN Date END),
   EndDate   = MAX(CASE WHEN (Rn - 1) % 2 = 1 THEN Date END)
FROM (  
   SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date) AS Rn
   FROM Data
) t
GROUP BY Name, ((Rn - 1) / 2)
ORDER BY Name

Result:

Name StartDate EndDate
x 2022-01-01 2022-01-02
x 2022-01-03 2022-01-04
x 2022-01-05 null
y 2022-02-01 2022-02-02
y 2022-02-03 2022-02-04
y 2022-02-05 null
z 2022-03-01 2022-03-02
z 2022-03-03 2022-03-04
z 2022-03-05 null
  • Related