I've found a lot of questions in here but none seems to resolve it.
I do want to retrieve unique values with STRING_AGG()
in SQL Server without using the keyword WITH
.
This is my query:
SELECT DISTINCT
bld.Code AS building_code,
-- CONCAT(bld.Name, ' - via ', dir.Name) AS building_name,
STRING_AGG(CAST(buildings.evaluation AS NVARCHAR(MAX)), ', ') AS [data]
FROM
Dealer de
INNER JOIN
Road ro ON de.ID = ro.Dealer_ID
INNER JOIN
Direction dir ON ro.ID = dir.Road_ID
INNER JOIN
Building bld ON dir.ID = bld.Direction_ID
INNER JOIN
BuildingType bt ON bld.BuildingType_ID = bt.ID
INNER JOIN
StructuralUnit su ON bld.ID = su.Building_ID
INNER JOIN
Span sp ON su.ID = sp.StructuralUnit_ID
INNER JOIN
UnitElement ue ON sp.ID = ue.Span_ID
INNER JOIN
(SELECT DISTINCT TOP 8
bld.Code AS building_code,
CONCAT(bld.Name, ' - via ', dir.Name) AS building_name,
dir.Name AS direction,
ROUND(AVG(ins.Evaluation), 2) AS evaluation,
YEAR(ig.Date) AS year
FROM
Building bld
INNER JOIN
Direction dir ON bld.Direction_ID = dir.ID
INNER JOIN
Road ro ON dir.Road_ID = ro.ID
INNER JOIN
Dealer de ON ro.Dealer_ID = de.ID
INNER JOIN
StructuralUnit su ON bld.ID = su.Building_ID
INNER JOIN
Span sp ON su.ID = sp.StructuralUnit_ID
INNER JOIN
UnitElement ue ON sp.ID = ue.Span_ID
INNER JOIN
Inspection ins ON ue.ID = ins.UnitElement_ID
INNER JOIN
InspectionGroup ig ON ig.ID = ins.InspectionGroup_ID
WHERE
ue.Status = 3
AND de.ID = 1
AND YEAR(ig.Date) IN (2021, 2020, 2019, 2018)
AND ig.InspectionTypeModel_ID <> 3
GROUP BY
bld.Code, bld.Name, dir.Name, YEAR(ig.Date)
ORDER BY
bld.Code, YEAR(ig.Date)) buildings ON buildings.building_code = bld.Code
WHERE
ue.Status = 3
AND de.ID = 1
GROUP BY
bld.Code --, bld.Name, dir.Name
ORDER BY
bld.Code
The subquery (the one with the ROUND(AVG(ins.Evaluation), 2) AS evaluation
) produces the correct distinct values:
But the full query doesn't work the way I though (with the DISTINCT
and GROUP BY
I though values would be unique, just 4 per row)
My desired results would be like this (don't mind the 3 dots at the end, I just resized the screen to show you only 4 values, that's how it's supposed to be)
Any suggestion, without using WITH
?
CodePudding user response:
Your main issue is that you are cross joining to your subquery. Your subquery has 4 rows for each building code, but you are only joining on code, therefore for every row in your outer query, you are returning 4 extra rows because of the subquery. If your data varies this could go up to as many as 8 (because of the TOP 8 in the subquery, without this there's more or less no limit to the duplication).
As far as I can tell you can massively simplify the query, none of the outer query is required at all, and with a small change to the grouping on the subquery you can get the required data without all the additional joins. The TOP
also seems unnecessary, with the proper grouping you would only ever get a maximum 4 rows per code anyway (one per year in the IN
clause):
SELECT b.Code,
Data = STRING_AGG(b.evaluation, ',') WITHIN GROUP(ORDER BY b.Year)
FROM (
SELECT bld.Code,
Year = YEAR(ig.Date),
evaluation = ROUND(AVG(ins.Evaluation), 2)
FROM Building AS bld
INNER JOIN Direction AS dir
ON bld.Direction_ID = dir.ID
INNER JOIN Road AS ro
ON dir.Road_ID = ro.ID
INNER JOIN Dealer AS de
ON ro.Dealer_ID = de.ID
INNER JOIN StructuralUnit AS su
ON bld.ID = su.Building_ID
INNER JOIN Span AS sp
ON su.ID = sp.StructuralUnit_ID
INNER JOIN UnitElement AS ue
ON sp.ID = ue.Span_ID
INNER JOIN Inspection AS ins
ON ue.ID = ins.UnitElement_ID
INNER JOIN InspectionGroup AS ig
ON ig.ID = ins.InspectionGroup_ID
WHERE ue.Status = 3
AND de.ID = 1
AND YEAR(ig.Date) IN (2021, 2020, 2019, 2018)
AND ig.InspectionTypeModel_ID <> 3
GROUP BY
bld.Code, YEAR(ig.Date)
) AS b
GROUP BY b.Code;