Home > Mobile >  Joining column spread across multiple rows based on condition
Joining column spread across multiple rows based on condition

Time:06-20

I have a table for employee with comment spread across multiple rows. Those need to be joined into a single row. To identify which rows can be joined, we need to use date field - if date is present and there is subsequent row with no date then that denote start of comment for employee. If however there is single row with no date with no prior date row as well then that is considered as new comment. The order in which comment are entered (identity column) is also provided so LEAD function was the way I was trying Below Table is what we have: Table screenshot

EmployeeId Date Comment Order
1001 2021-01-08 This is only first part 1
1001 NULL this is the second 2
1001 NULL and this is third part 3
1001 2021-01-15 This is a new comment for same 4
1002 2021-01-16 This one has subsequent comment 5
1002 2021-01-16 The second comment 6
1003 NULL This is single comment 7
1003 2021-01-12 This is also single comment 8

The result we expect is : Result Expected

EmployeeId Date Comment Order
1001 2021-01-08 This is only first part this is the second and this is third part 1
1001 NULL This is a new comment for same 4
1002 2021-01-15 This one has subsequent comment The second comment 5
1003 2021-01-16 This is single comment 7
1003 2021-01-16 This is also single comment 8

I am trying the lead function but not able to get how to join n number of row based on condition. Any help?

SQL :

CREATE TABLE Comments(
    [EmployeeID] [int] NOT NULL,
    [Date] [date] null,
    [Comment] [varchar](100) NULL,
    [Order] [int] NULL
    )
INSERT INTO Comments VALUES('1001','1/8/2021', 'This is only first part', 1)
INSERT INTO Comments VALUES('1001',NULL, 'this is the second', 2)
INSERT INTO Comments VALUES('1001',NULL, 'and this is third part', 3)
INSERT INTO Comments VALUES('1001','1/15/2021', 'This is a new comment for same', 4)
INSERT INTO Comments VALUES('1002','1/16/2021', 'This one has subsequent comment', 5)
INSERT INTO Comments VALUES('1002','1/16/2021', 'The second comment', 6)
INSERT INTO Comments VALUES('1003',NULL, 'This is single comment', 7)
INSERT INTO Comments VALUES('1003','1/12/2021', 'This is also single comment', 8)

CodePudding user response:

I left a bunch of comments about details in your "expected results" that did not make sense given your requirements. If I go by your stated requirements then here is a solution:

First normalize the table to have dates so we can use group by

SELECT EmployeeID, 
       COALESCE([Date],
                LAG([Date] OVER (ORDER BY [Order] ASC), -- Get the prior one if null
                MIN([Date] OVER (PARTITION BY EmployeeID ORDER BY [Date] ASC)) AS [Date],  -- Get the smallest one if the last two are null
       Comment,
       [Order]
FROM sometableyoudidnotname

Now that we have this table we can use group by and string_agg

SELECT EmployeeID,
       MIN(cDate) as [DATE],
       STRING_AGG(Comment, ' ') WITHIN GROUP (ORDER BY [ORDER] ASC) AS Comment
FROM (
  SELECT EmployeeID, 
         COALESCE([Date],
                  LAG([Date] OVER (ORDER BY [Order] ASC), -- Get the prior one if null
                  MIN([Date] OVER (PARTITION BY EmployeeID ORDER BY [Date] ASC)) AS [Date],  -- Get the smallest one if the last two are null
         Comment,
         [Order]
  FROM sometableyoudidnotname) X
GROUP BY EmployeeID
  • Related