I have a table for employee with comment spread across multiple rows. Those need to be joined into a single row. To identify which rows can be joined, we need to use date field - if date is present and there is subsequent row with no date then that denote start of comment for employee. If however there is single row with no date with no prior date row as well then that is considered as new comment. The order in which comment are entered (identity column) is also provided so LEAD function was the way I was trying Below Table is what we have: Table screenshot
EmployeeId | Date | Comment | Order |
---|---|---|---|
1001 | 2021-01-08 | This is only first part | 1 |
1001 | NULL | this is the second | 2 |
1001 | NULL | and this is third part | 3 |
1001 | 2021-01-15 | This is a new comment for same | 4 |
1002 | 2021-01-16 | This one has subsequent comment | 5 |
1002 | 2021-01-16 | The second comment | 6 |
1003 | NULL | This is single comment | 7 |
1003 | 2021-01-12 | This is also single comment | 8 |
The result we expect is : Result Expected
EmployeeId | Date | Comment | Order |
---|---|---|---|
1001 | 2021-01-08 | This is only first part this is the second and this is third part | 1 |
1001 | NULL | This is a new comment for same | 4 |
1002 | 2021-01-15 | This one has subsequent comment The second comment | 5 |
1003 | 2021-01-16 | This is single comment | 7 |
1003 | 2021-01-16 | This is also single comment | 8 |
I am trying the lead function but not able to get how to join n number of row based on condition. Any help?
SQL :
CREATE TABLE Comments(
[EmployeeID] [int] NOT NULL,
[Date] [date] null,
[Comment] [varchar](100) NULL,
[Order] [int] NULL
)
INSERT INTO Comments VALUES('1001','1/8/2021', 'This is only first part', 1)
INSERT INTO Comments VALUES('1001',NULL, 'this is the second', 2)
INSERT INTO Comments VALUES('1001',NULL, 'and this is third part', 3)
INSERT INTO Comments VALUES('1001','1/15/2021', 'This is a new comment for same', 4)
INSERT INTO Comments VALUES('1002','1/16/2021', 'This one has subsequent comment', 5)
INSERT INTO Comments VALUES('1002','1/16/2021', 'The second comment', 6)
INSERT INTO Comments VALUES('1003',NULL, 'This is single comment', 7)
INSERT INTO Comments VALUES('1003','1/12/2021', 'This is also single comment', 8)
CodePudding user response:
I left a bunch of comments about details in your "expected results" that did not make sense given your requirements. If I go by your stated requirements then here is a solution:
First normalize the table to have dates so we can use group by
SELECT EmployeeID,
COALESCE([Date],
LAG([Date] OVER (ORDER BY [Order] ASC), -- Get the prior one if null
MIN([Date] OVER (PARTITION BY EmployeeID ORDER BY [Date] ASC)) AS [Date], -- Get the smallest one if the last two are null
Comment,
[Order]
FROM sometableyoudidnotname
Now that we have this table we can use group by and string_agg
SELECT EmployeeID,
MIN(cDate) as [DATE],
STRING_AGG(Comment, ' ') WITHIN GROUP (ORDER BY [ORDER] ASC) AS Comment
FROM (
SELECT EmployeeID,
COALESCE([Date],
LAG([Date] OVER (ORDER BY [Order] ASC), -- Get the prior one if null
MIN([Date] OVER (PARTITION BY EmployeeID ORDER BY [Date] ASC)) AS [Date], -- Get the smallest one if the last two are null
Comment,
[Order]
FROM sometableyoudidnotname) X
GROUP BY EmployeeID