Home > Enterprise >  Condense or merge rows with null values not using group by
Condense or merge rows with null values not using group by

Time:01-27

Let's say I have a select which returns the following Data:

select nr, name, val_1, val_2, val_3 
from table
 Nr. | Name       | Value 1 | Value 2 | Value 3
----- ------------ --------- --------- ---------
   1 | Max        | 123     | NULL    | NULL 
   1 | Max        | NULL    | 456     | NULL 
   1 | Max        | NULL    | NULL    | 789
   9 | Lisa       | 1       | NULL    | NULL
   9 | Lisa       | 3       | NULL    | NULL
   9 | Lisa       | NULL    | NULL    | Hello
   9 | Lisa       | 9       | NULL    | NULL

I'd like to condense the rows down to the bare minimum with.

I want the following result:

 Nr. | Name       | Value 1 | Value 2 | Value 3
----- ------------ --------- --------- ---------
   1 | Max        | 123     | 456     | 789
   9 | Lisa       | 1       | NULL    | Hello
   9 | Lisa       | 3       | NULL    | NULL
   9 | Lisa       | 9       | NULL    | NULL

For condensing the rows with Max (Nr. 1) a group by of the max values would help.

select nr, name, max(val_1), max(val_2), max(val_3) 
from table 
group by nr, name 

But I am unsure how to get the desired results for Lisa (Nr. 9). The row for Lisa contains a value in the Value 3 column, in this example it's condensed with the first row that matches Nr and Name and has a Null value in Value 3.

I'm thankful for every input!

CodePudding user response:

Basic principle is same as Vladimir's solution. This uses UNPIVOT and PIVOT

with cte as
(
    select nr, name, col, val,
           rn = row_number() over(partition by nr, name, col order by val) 
    from   [table]
           unpivot
           (
                val
                for col in (val_1, val_2, val_3)
           ) u
)
select *
from   (
           select nr, name, rn, col, val
           from   cte
       ) d
       pivot
       (
           max (val)
           for col in ([val_1], [val_2], [val_3])
       ) p

CodePudding user response:

Here is one way to do it. Assign a unique row number for each column by sorting them in such a way that NULLs come last and then join them back together using these row numbers and remove rows with all NULLs.

Run just the CTE first and examine the intermediate result to understand how it works.

Sample data

DECLARE @T TABLE (Nr varchar(10), Name varchar(10), V1 varchar(10), V2 varchar(10), V3 varchar(10));
INSERT INTO @T VALUES
('1', 'Max ', '123'  , NULL , NULL ),
('1', 'Max ', NULL   , '456', NULL ),
('1', 'Max ', NULL   , NULL , '789'),
('9', 'Lisa', '1'    , NULL , NULL ),
('9', 'Lisa', '3'    , NULL , NULL ),
('9', 'Lisa', NULL   , NULL , 'Hello'),
('9', 'Lisa', '9'    , NULL , NULL );

Query

WITH CTE
AS
(
    SELECT
        Nr
        ,Name
        ,V1
        ,V2
        ,V3
        -- here we use CASE WHEN V1 IS NULL THEN 1 ELSE 0 END to put NULLs last
        ,ROW_NUMBER() OVER (PARTITION BY Nr ORDER BY CASE WHEN V1 IS NULL THEN 1 ELSE 0 END, V1) AS rn1
        ,ROW_NUMBER() OVER (PARTITION BY Nr ORDER BY CASE WHEN V2 IS NULL THEN 1 ELSE 0 END, V2) AS rn2
        ,ROW_NUMBER() OVER (PARTITION BY Nr ORDER BY CASE WHEN V3 IS NULL THEN 1 ELSE 0 END, V3) AS rn3
    FROM @T AS T
)
SELECT
    T1.Nr
    ,T1.Name
    ,T1.V1
    ,T2.V2
    ,T3.V3
FROM
    CTE AS T1
    INNER JOIN CTE AS T2 ON T2.Nr = T1.Nr AND T2.rn2 = T1.rn1
    INNER JOIN CTE AS T3 ON T3.Nr = T1.Nr AND T3.rn3 = T1.rn1
WHERE
    T1.V1 IS NOT NULL
    OR T2.V2 IS NOT NULL
    OR T3.V3 IS NOT NULL
ORDER BY
    T1.Nr, T1.rn1
;

Result

 ---- ------ ----- ------ ------- 
| Nr | Name | V1  |  V2  |  V3   |
 ---- ------ ----- ------ ------- 
|  1 | Max  | 123 | 456  | 789   |
|  9 | Lisa |   1 | NULL | Hello |
|  9 | Lisa |   3 | NULL | NULL  |
|  9 | Lisa |   9 | NULL | NULL  |
 ---- ------ ----- ------ ------- 
  • Related