I am stuck on a problem. I want to remove consecutive duplicate records in the table,
I,e, in the below table I want to calculate the total cost without consecutive duplication.
Like, row 3 should be removed as it is consecutively duplicated with row 2 as all 3 column data is the same.
And same is the case in the second group, row 7 should be removed as it is a duplication of 6.
The total cost at the end should be 10.
How can I do it in SSMS?
ClaimID | ClaimLine | Cost |
---|---|---|
M0001R1616878951 | 2 | 10 |
M0001R1616878951 | 2 | -10 |
M0001R1616878951 | 2 | -10 |
M0001R1616878951 | 3 | 10 |
M0001R1616878951 | 3 | -10 |
M0001R1616878951 | 3 | 10 |
M0001R1616878951 | 3 | 10 |
I searched for this problem and tried lead and lag keywords but didnt work.
CodePudding user response:
I prepared an example for your issue and it might be solution for that.
I used CTE, ROW_NUMBER and IIF expressions to generate row_number and filter for duplicate rows.
Preparing the example data:
DECLARE @vClaims TABLE (
ClaimID NVARCHAR(16),
ClaimLine SMALLINT,
Cost SMALLINT
)
INSERT INTO @vClaims
VALUES
('M0001R1616878951', 2, 10),
('M0001R1616878951', 2, -10),
('M0001R1616878951', 2, -10),
('M0001R1616878951', 3, 10),
('M0001R1616878951', 3, -10),
('M0001R1616878951', 3, 10),
('M0001R1616878951', 3, 10)
And the query script:
;WITH CTE_ClaimsWithSort AS (
SELECT
ClaimID,
ClaimLine,
Cost,
RowNumber = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM
@vClaims
), CTE_ClaimsFiltered AS (
SELECT
ClaimID,
ClaimLine,
Cost,
RowNumber,
isDuplicate = IIF(
LAG(ClaimID) OVER(ORDER BY RowNumber) = ClaimID
AND LAG(ClaimLine) OVER(ORDER BY RowNumber) = ClaimLine
AND LAG(Cost) OVER(ORDER BY RowNumber) = Cost
, 1, 0)
FROM
CTE_ClaimsWithSort
)
SELECT
ClaimID,
ClaimLine,
Cost,
RowNumber,
isDuplicate
FROM
CTE_ClaimsFiltered
WHERE
isDuplicate = 0
First part of cte: generate row_numbers for your example data. If you have a date column you can use instead of it.
Second part of cte: find and filter duplicate rows by ClaimID, ClaimLine and Cost with IIF expression
The result:
ClaimID | ClaimLine | Cost | RowNumber | isDuplicate |
---|---|---|---|---|
M0001R1616878951 | 2 | 10 | 1 | 0 |
M0001R1616878951 | 2 | -10 | 2 | 0 |
M0001R1616878951 | 3 | 10 | 4 | 0 |
M0001R1616878951 | 3 | -10 | 5 | 0 |
M0001R1616878951 | 3 | 10 | 6 | 0 |