I'm currently trying to turn an outer apply into a left join to save some complexity.
SELECT *
FROM fact_table h
OUTER APPLY (SELECT TOP 1
*
FROM dimension mcc WITH (NOLOCK)
WHERE h.product = mcc.product
AND h.country = mcc.country
AND mcc.date IN (SELECT MAX(date)
FROM dimension dd WITH (NOLOCK)
WHERE FORMAT(DATEADD(MONTH, -3, dd.date), 'yyyyMM') <= h.month_in_the_year
AND dd.product = h.product
AND dd.country = h.country)) a;
I basically use it to get the related data from Dimension linked with the latest data point that's earlier than 3 months ago.
I'm trying to turn it into a left join, but it's taking a lot more time since I don't filter the dimension before the join :
SELECT TOP 10
*
FROM fact_table h
LEFT JOIN dimension a ON h.product = a.product
AND h.country = a.country
AND a.pkid = (SELECT TOP 1
pkid
FROM dimension dd
WHERE FORMAT(DATEADD(MONTH, -3, dd.date), 'yyyyMM') <= h.month_in_the_year
ORDER BY date DESC);
Do you have an idea on how to turn it efficiently into a left join ?
CodePudding user response:
It looks like you can significantly simplify this query, by simply adding an ORDER BY
. I've also modified the date filter in order to leverage indexing properly.
SELECT *
FROM fact_table h
OUTER APPLY (
SELECT TOP 1 *
FROM dimension mcc
WHERE h.product = mcc.product
AND h.country = mcc.country
AND mcc.date < DATEADD(MONTH, 2, DATEFROMPARTS(LEFT(h.month_in_the_year, 4), RIGHT(h.month_in_the_year, 2), 1))
ORDER BY mcc.date DESC
) a;
To transform this into a LEFT JOIN
, you need to utilize row-numbering
SELECT *
FROM (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY h.PrimaryKeyColumn ORDER BY mcc.date)
FROM fact_table h
LEFT JOIN dimension mcc
ON h.product = mcc.product
AND h.country = mcc.country
AND mcc.date < DATEADD(MONTH, 2, DATEFROMPARTS(LEFT(h.month_in_the_year, 4), RIGHT(h.month_in_the_year, 2), 1))
) a
WHERE rn = 1;