I have the following query: that is trying to get pt_ids that have certain diagnosis codes OR certain outcome_codes. I have accounted for these conditions by including the specific codes I'd like in the WHERE clause. However, I would like to know if I can also do this using in INNER JOIN.
SELECT DISTINCT
pt_id
FROM #df
WHERE flag <> 1
AND (diagnosis IN
(
SELECT Code
FROM #df_codes
WHERE code_id = '20'
)
OR outcome_code IN
(
SELECT Code
FROM #df_codes
WHERE code_id = '25'
));
The above code is taking quite a while to run and I am wondering if writing it in the following way 1) is appropriate/viable and 2) will leader to a faster run time. I am essentially trying to join #df with #df_codes on either Code = diagnosis OR code = outcome_code
My Solution
SELECT DISTINCT
pt_id
FROM #df a
JOIN (select * from #df_codes where code_id = '20')b ON a.diagnosis = b.Code OR
(select * from #df_codes where code_id = '25')c ON a.outcome_code = c.Code
WHERE flag <> 1
CodePudding user response:
I would do it like this:
SELECT DISTINCT pt_id
FROM #df
INNER JOIN #df_codes ON (
#df.diagnosis = #df_codes.code
AND #df_codes.code_id = '20'
)
OR (
#df.outcome_code = #df_codes.code
AND #df_codes.code_id = '25'
)
WHERE flag <> 1
CodePudding user response:
Just a rough suggestion to create small temp tables and join it back to main table. I would suggest use the CTE or original table query and do following:
SELECT Code
into #x
FROM #df_codes
WHERE code_id = '20'
SELECT Code
into #y
FROM #df_codes
WHERE code_id = '25'
SELECT
pt_id
FROM #df
join #x x on x.Code = diagnosis
WHERE flag <> 1
Union
SELECT
pt_id
FROM #df
join #y y on y.Code = diagnosis
WHERE flag <> 1
Just replace #df by a CTE that way you can use the existing indexes.
CodePudding user response:
Use EXISTS instead:
SELECT DISTINCT
d.pt_id
FROM #df d
WHERE flag <> 1
AND EXISTS (SELECT *
FROM #df_codes c
WHERE (c.code = d.diagnosis AND c.code_id = '20')
OR (c.code = d.outcome_code AND c.code_id = '25');
If #df lists each patient a single time - then you don't need the DISTINCT. If you try using a join and a patient has both then you would get 2 rows - and then use DISTINCT to eliminate the 'duplicates'.