I am attempting to join three tables with the parent table to get an output of customer id, visit id, and visit date where the oldest date is selected if more than one date. All four tables contain the cust_id, visit_id, and visit_date variables, but visit_id is the linking variable. I named the four tables a, b, c, and d to simplify the code below (a is the parent table).
My current code below gets me closer to my desired output, but I'm not sure how to get one visit_date column with the oldest date by visit_id and am unsure on how to remove duplicates. Any help will be sincerely appreciated, and please let me know if I can clarify anything!
SELECT
a.cust_id,
a.visit_id,
a.visit_date,
b.visit_date,
c.visit_date,
d.visit_date
FROM a
LEFT JOIN b
ON a.visit_id = b.visit_id
LEFT JOIN c
ON a.visit_id = c.visit_id
LEFT JOIN d
ON a.visit_id = d.visit_id
WHERE a.visit_date IS NOT NULL
OR b.visit_date IS NOT NULL
OR c.visit_date IS NOT NULL
OR d.visit_date IS NOT NULL;
My current output looks like this:
cust_id | visit_id | visit_date | visit_date | visit_date | visit_date |
---|---|---|---|---|---|
001 | 001 | 2013-05-23 | |||
002 | 002 | 2013-06-03 | 2013-05-27 | ||
003 | 003 | 2013-06-05 | 2013-06-05 | 2013-06-05 | |
003 | 004 | 2013-07-09 | 2013-07-09 | 2013-07-06 |
My desired output looks like this:
cust_id | visit_id | visit_date |
---|---|---|
001 | 001 | 2013-05-23 |
002 | 002 | 2013-05-27 |
003 | 003 | 2013-06-05 |
003 | 004 | 2013-07-06 |
CodePudding user response:
You could use the scalar LEAST
function:
SELECT
a.cust_id,
a.visit_id,
LEAST(a.visit_date, b.visit_date, c.visit_date, d.visit_date) AS visit_date
FROM a
LEFT JOIN b
ON a.visit_id = b.visit_id
LEFT JOIN c
ON a.visit_id = c.visit_id
LEFT JOIN d
ON a.visit_id = d.visit_id
WHERE
a.visit_date IS NOT NULL OR
b.visit_date IS NOT NULL OR
c.visit_date IS NOT NULL OR
d.visit_date IS NOT NULL;