I would like to join multiple dimension tables to a fact table. However, instead of returning all the results, I want to avoid a one-to-many relationship result. Instead, I would like to access just some of the data in one table, concatenate all of its findings, and return them into a single column so that the expected result of 1 row per fact remains. I do not want a one-to-many relationship in my result.
If I pair this answer How to concatenate text from multiple rows into a single text string in SQL Server with additional columns of interest, I get an error, saying: is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
-- sample values
-- bridge
dim2Key, groupKey
1, 1
2, 1
3, 1
4, 2
-- dim2
dim2Key, attributeOne
1, 'A'
2, 'B'
3, 'C'
4, 'A'
-- dim1
dim1Key, attributeTwo, attributeThree,
1, 35, 'val1'
2, 25, 'val2'
3, 45, 'val3'
4, 55, 'val1'
-- fact1
dim1Key, factvalue1, groupKey,
1, 5, 1
2, 25, 1
3, 55, -1
4, 99, 2
-- expected values
-- fact1
dim1Key, factvalue1, groupKey, attributeTwo, attributeThree, attributeOne
1, 5, 1, 35, 'val1', 'A, B, C'
...
4, 99, 2, 55, 'val1', 'A'
CodePudding user response:
It's very unclear what your schema and joins should be, but it seems you want to aggregate dim2
per row of fact1
.
You can aggregate dim2
inside a correlated subquery. It's often nicer to put this in an APPLY
but you could also put it directly in the SELECT
SELECT
fact1.dim1Key,
fact1.factvalue1,
fact1.groupKey,
dim1.attributeTwo,
dim1.attributeThree,
dim2.attributeOne
FROM fact1
JOIN dim1 ON dim1.dim1key = fact1.dim1key
CROSS APPLY (
SELECT attributeOne = STRING_AGG(dim2.attributeOne, ', ')
FROM bridge b
JOIN dim2 ON dim2.dim2key = b.dim2key
WHERE b.groupKey = fact1.groupKey
) dim2
CodePudding user response:
If you don't have access to the STRING_AGG function in your version of SQL, you can use FOR XML PATH to achieve the same thing.
CREATE TABLE #bridge (dim2Key int, groupKey int)
INSERT #bridge (dim2Key, groupKey)
VALUES (1, 1)
,(2, 1)
,(3, 1)
,(4, 2)
CREATE TABLE #dim2 (dim2Key int, attributeOne varchar(5))
INSERT #dim2 (dim2Key, attributeOne)
VALUES (1, 'A')
,(2, 'B')
,(3, 'C')
,(4, 'A')
CREATE TABLE #dim1 (dim1Key int, attributeTwo int, attributeThree varchar(5))
INSERT #dim1 (dim1Key, attributeTwo, attributeThree)
VALUES (1, 35, 'val1')
,(2, 25, 'val2')
,(3, 45, 'val3')
,(4, 55, 'val1')
CREATE TABLE #fact1 (dim1Key int, factvalue1 int, groupKey int)
INSERT #fact1 (dim1Key, factvalue1, groupKey)
VALUES (1, 5, 1)
,(2, 25, 1)
,(3, 55, -1)
,(4, 99, 2)
GO
;WITH pvt (groupKey, attributeOne)
AS
(
SELECT b.groupKey, d2.attributeOne
FROM #dim2 d2
JOIN #bridge b
ON d2.dim2Key = b.dim2Key
)
, dim2 AS
(
SELECT DISTINCT a.groupKey
,LEFT(r.attributeOne , LEN(r.attributeOne)-1) attributeOne
FROM pvt a
CROSS APPLY
(
SELECT attributeOne ', '
FROM pvt r
WHERE a.groupKey = r.groupKey
FOR XML PATH('')
) r (attributeOne)
)
SELECT f1.dim1Key, factvalue1, f1.groupKey, attributeTwo, attributeThree, attributeOne
FROM #fact1 f1
LEFT JOIN #dim1 d1
ON f1.dim1Key = d1.dim1Key
LEFT JOIN dim2 d2
ON f1.groupKey = d2.groupKey