I am trying to join two tables that share the same individual ID (key
). The first table (a
) is a 'wide' table with many variables including key
and age
, and the second table (b
) is a 'long' table, including only the variables key
, diagnosis_number
, and diagnosis
, where each individual can have multiple values of diagnosis
.
I want to select columns key
, age
, and the primary diagnosis for individuals where:
diagnosis
= "a", "b", or "c" whendiagnosis_number
= 1 [the 'primary diagnosis']- AND
diagnosis
= "y" for any ofdiagnosis_number
= 2:20
I've tried:
SELECT main.key, main.age, diag.diagnosis
FROM a as main
INNER JOIN
(
SELECT prim.key, prim.diagnosis
FROM
(SELECT DISTICT key, diagnosis
FROM b
WHERE diagnosis IN ('a', 'b', 'c')
AND diagnosis_number = 1) as prim
INNER JOIN
(SELECT DISTICT key, diagnosis
FROM b
WHERE diagnosis = 'y'
AND diagnosis_number BETWEEN 2 AND 20) as sec
ON prim.key = sec.key) as diag
ON main.key = diag.key
CodePudding user response:
Think this could be solved without subqueries/SELECTs inside of joins:
SQL Server
SELECT a.key, a.age, b.diagnosis
FROM a
INNER JOIN b
ON b.key = a.key
WHERE b.diagnosis IN ('a', 'b', 'c')
AND b.diagnosis_code = 1
AND a.key IN (SELECT b1.key
FROM b AS b1
WHERE b1.diagnosis = 'y'
AND b1.diagnosis_code BETWEEN 2 AND 20)