Home > Software design >  Concatenate rows based on multiple column values
Concatenate rows based on multiple column values

Time:11-13

I am trying to get a new column with a concatenation of all distinct row values. This aggregation would be based on other columns.

enter image description here

I have tried the following but I get the same values repeated in the new column (A1, A1, A4). I need the concatenation to be distinct.

SELECT
    STRING_AGG(COLUMN1, ', ') AS COLUMN1_ALIAS
    ,COLUMN2
    ,COLUMN3
    ,COLUMN4
FROM TABLE
GROUP BY COLUMN2 ,COLUMN3 ,COLUMN4

CodePudding user response:

It looks like you want windowing rather than aggregation. Unfortunately, string_agg does not support over() in SQL Server ; neither does it support distinct in its aggregated form.

We could work around it with subqueries ; it is probably more efficient to deduplicate and pre-compute the aggregates first, then join with the original table:

select t.*, x.column1_alias
from mytable t
inner join (
    select column2, column3, column4, string_agg(column1, ', ') as column1_alias
    from (select distinct column1, column2, column3, column4 from mytable) t
    group by column2, column3, column4
) x on x.column2 = t.column2 and x.column3 = t.column3 and x.column4 = t.column4

Side note : in a database that supports both over() and distinct on string aggregation, the query would phrase as:

select t.*, 
    string_agg(distinct column4, ', ') 
        over(partition by column2, column3, column4) as column1_alias
from mytable t
  • Related