I have such a situation. For some reason, when I join two temporary tables one column gets completely different results. Depending whether I group by LOC column from Actuals or Forecast, one of the column gives correct results and another one gets something totally weird.
The results are accurate for Actuals (1718789) but wrong for Forecast
WITH ACTUALS AS
(
SELECT [LOC], [DMDUNIT], [DMDPostDate],
SUM(HistoryQuantity) AS 'Actuals'
FROM SCPOMGR.HISTWIDE_CHAIN
GROUP BY [LOC], [DMDUNIT], [DMDPostDate]
),
Forecast AS
(
SELECT [LOC], [DMDUNIT], [STARTDATE],
SUM(TOTFCST) AS 'Forecast'
FROM SCPOMGR.FCSTPERFSTATIC
-- Forecast Albertsons 99484.136 (wrong: should be 122880.591)
GROUP BY [LOC], [DMDUNIT], [STARTDATE]
)
SELECT A.[LOC], SUM(A.Actuals) AS 'Actuals', SUM(F.Forecast) AS 'Forecast'
FROM Actuals A FULL OUTER JOIN Forecast F
on A.[DMDUNIT] = F.[DMDUNIT]
AND f.[STARTDATE] = a.[DMDPostDate]
and a.[LOC] = f.[LOC]
GROUP BY A.[LOC]
ORDER BY A.[LOC]
When I change GROUP BY from A.[LOC] to F.[LOC] I have the reverse effect. Now my Actuals are correct, but Forecast is not.
Something is wrong with the GROUP BY. Depending if I GROUP BY a.[LOC] I get a correct result for 'Actuals' but incorrect for 'Forecast', if I GROUP BY f.[LOC] I get a correct result for 'Forecast', but incorrect for 'Actuals'.
WITH ACTUALS AS
(
SELECT [LOC], [DMDUNIT], [DMDPostDate],
SUM(HistoryQuantity) AS 'Actuals'
FROM SCPOMGR.HISTWIDE_CHAIN
-- Actuals 80324 (Albertsons) grouped by F.[LOC] - incorrect (should be 1718789)
GROUP BY [LOC], [DMDUNIT], [DMDPostDate]
),
Forecast AS
(
SELECT [LOC], [DMDUNIT], [STARTDATE],
SUM(TOTFCST) AS 'Forecast'
FROM SCPOMGR.FCSTPERFSTATIC
-- Forecast Albertsons 122880.591 (Albertsons) grouped by f.[LOC] - correct
GROUP BY [LOC], [DMDUNIT], [STARTDATE]
)
SELECT F.[LOC], SUM(F.Forecast) AS 'Forecast', SUM(A.Actuals) AS 'Actuals'
FROM Forecast F FULL OUTER JOIN Actuals A
on F.[DMDUNIT] = A.[DMDUNIT]
AND F.[STARTDATE] = A.[DMDPostDate]
and F.[LOC] = A.[LOC]
GROUP BY F.[LOC]
ORDER BY F.[LOC]
It is technically the same code just GROUP BY throws off the results of one of the columns.
Does someone know how it can be fixed. I need to keep the JOIN on those 3 fields
on A.[DMDUNIT] = F.[DMDUNIT]
AND f.[STARTDATE] = a.[DMDPostDate]
and a.[LOC] = f.[LOC]
CodePudding user response:
You're using a FULL OUTER JOIN, so I assume there will be unmatched rows between the two tables and LOC will sometimes be NULL in either table. Use COALESCE to get the value from either table when one is NULL.
WITH ACTUALS AS
(
SELECT [LOC], [DMDUNIT], [DMDPostDate],
SUM(HistoryQuantity) AS 'Actuals'
FROM SCPOMGR.HISTWIDE_CHAIN
GROUP BY [LOC], [DMDUNIT], [DMDPostDate]
),
Forecast AS
(
SELECT [LOC], [DMDUNIT], [STARTDATE],
SUM(TOTFCST) AS 'Forecast'
FROM SCPOMGR.FCSTPERFSTATIC
-- Forecast Albertsons 99484.136 (wrong: should be 122880.591)
GROUP BY [LOC], [DMDUNIT], [STARTDATE]
)
SELECT COALESCE(a.[LOC], f.[LOC]) as LOC, SUM(A.Actuals) AS 'Actuals', SUM(F.Forecast) AS 'Forecast'
FROM Actuals A FULL OUTER JOIN Forecast F
on A.[DMDUNIT] = F.[DMDUNIT]
AND f.[STARTDATE] = a.[DMDPostDate]
and a.[LOC] = f.[LOC]
GROUP BY COALESCE(a.[LOC], f.[LOC])
ORDER BY LOC