Home > OS >  SQL: Join fact on dimension using bridge table and return concatenated list for column of interest
SQL: Join fact on dimension using bridge table and return concatenated list for column of interest

Time:11-15

I would like to join multiple dimension tables to a fact table. However, instead of returning all the results, I want to avoid a one-to-many relationship result. Instead, I would like to access just some of the data in one table, concatenate all of its findings, and return them into a single column so that the expected result of 1 row per fact remains. I do not want a one-to-many relationship in my result.

If I pair this answer How to concatenate text from multiple rows into a single text string in SQL Server with additional columns of interest, I get an error, saying: is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

-- sample values
-- bridge
dim2Key, groupKey
1,          1
2,          1
3,          1
4,          2

-- dim2
dim2Key, attributeOne
1,      'A'
2,      'B'
3,      'C'
4,      'A'

-- dim1
dim1Key, attributeTwo, attributeThree,
1,      35,                  'val1'      
2,      25,                  'val2'   
3,      45,                  'val3'   
4,      55,                  'val1'   

-- fact1
dim1Key, factvalue1, groupKey,
1,        5,            1
2,        25,           1
3,        55,           -1
4,        99,           2


-- expected values
-- fact1
dim1Key, factvalue1, groupKey, attributeTwo, attributeThree, attributeOne
1,          5,          1,          35,         'val1',         'A, B, C'
...
4,          99,         2,           55,         'val1',            'A'

CodePudding user response:

It's very unclear what your schema and joins should be, but it seems you want to aggregate dim2 per row of fact1.

You can aggregate dim2 inside a correlated subquery. It's often nicer to put this in an APPLY but you could also put it directly in the SELECT

SELECT
  fact1.dim1Key,
  fact1.factvalue1,
  fact1.groupKey,
  dim1.attributeTwo,
  dim1.attributeThree,
  dim2.attributeOne
FROM fact1
JOIN dim1 ON dim1.dim1key = fact1.dim1key
CROSS APPLY (
    SELECT attributeOne = STRING_AGG(dim2.attributeOne, ', ')
    FROM bridge b
    JOIN dim2 ON dim2.dim2key = b.dim2key
    WHERE b.groupKey = fact1.groupKey
) dim2

CodePudding user response:

If you don't have access to the STRING_AGG function in your version of SQL, you can use FOR XML PATH to achieve the same thing.

    CREATE TABLE #bridge (dim2Key int, groupKey int)
    INSERT  #bridge (dim2Key, groupKey) 
    VALUES (1, 1)
    ,(2, 1)
    ,(3, 1)
    ,(4, 2)

    CREATE TABLE #dim2 (dim2Key int, attributeOne varchar(5))
    INSERT  #dim2 (dim2Key, attributeOne)
    VALUES (1, 'A')
    ,(2, 'B')
    ,(3, 'C')
    ,(4, 'A')

    CREATE TABLE #dim1 (dim1Key int, attributeTwo int, attributeThree varchar(5))
    INSERT #dim1 (dim1Key, attributeTwo, attributeThree)
    VALUES (1, 35,   'val1') 
    ,(2, 25,   'val2')   
    ,(3, 45,   'val3')   
    ,(4, 55,   'val1')   

    CREATE TABLE #fact1 (dim1Key int, factvalue1 int, groupKey int)
    INSERT #fact1 (dim1Key, factvalue1, groupKey)
    VALUES (1,   5,  1)
    ,(2,   25, 1)
    ,(3,   55, -1)
    ,(4,   99, 2)
    GO
    

    ;WITH pvt (groupKey, attributeOne)
    AS 
    (
        SELECT b.groupKey, d2.attributeOne 
        FROM #dim2 d2
        JOIN #bridge b
        ON d2.dim2Key = b.dim2Key
    )
    , dim2 AS
    (
        SELECT DISTINCT a.groupKey
        ,LEFT(r.attributeOne , LEN(r.attributeOne)-1) attributeOne 
        FROM pvt a
        CROSS APPLY
        (
            SELECT attributeOne   ', '
            FROM pvt r
            WHERE a.groupKey = r.groupKey
            FOR XML PATH('')
        ) r (attributeOne)
    )



    SELECT f1.dim1Key, factvalue1, f1.groupKey, attributeTwo, attributeThree, attributeOne 
    FROM #fact1 f1
    LEFT JOIN #dim1 d1
    ON f1.dim1Key = d1.dim1Key
    LEFT JOIN dim2 d2
    ON f1.groupKey = d2.groupKey
  • Related