I'm having an issue where I need to connect a unique key from one table that has multiple matching keys in another table, but I only need the most recent entry for each key.
The issue arises as I'm using tags to identify animals, however over time I need to re-use the tags. To solve this I use a unique key for each tag where I include the year of birth in the ID of the animal.
I now have some new recordings where I identify the sex of each animal, but only the tag is recorded. I need to match the tag to the correct unique animal tag, however as most of the tags are re-used I need to ensure I'm only matching the tag to the newest version, i.e. the most recent year for the relevant tag in the unique_tag column.
Is there a way to tell SQL which entry to use when there are multiple matching keys?
Simplified example tables and current code:
table_a:
unique_tag |
---|
2020TAG1 |
2016TAG2 |
2004TAG1 |
2001TAG2 |
table_b:
tag | sample_date | sex |
---|---|---|
TAG1 | 2022-02-10 | M |
TAG2 | 2022-02-10 | F |
SELECT unique_tag, table_b.tag, table_b.sample_date, table_b.sex
FROM table_a
INNER JOIN table_b
ON RIGHT(table_a.unique_tag, 4) = table_b.tag;
This code results in multiple entries for each tag, however I only need the one with the most recent year in the unique_tag.
Any suggestions are greatly appreciated!
CodePudding user response:
You can use ORDER BY sample_date DESC LIMIT 1
this should give you one result where te date is the latest.
CodePudding user response:
You can use row_number()
and get the rows with the biggest year of tag for each tag:
SELECT unique_tag, tag, sample_date, sex FROM
(SELECT unique_tag, table_b.tag, table_b.sample_date, table_b.sex,
row_number() over(partition by RIGHT(table_a.unique_tag, 4) order by LEFT(table_a.unique_tag, 4) desc) rn
FROM table_a
INNER JOIN table_b
ON RIGHT(table_a.unique_tag, 4) = table_b.tag) t
WHERE rn = 1;
CodePudding user response:
You can use a subquery with IN()
condition:
SELECT unique_tag, table_b.tag, table_b.sample_date, table_b.sex
FROM table_a
INNER JOIN table_b
ON RIGHT(table_a.unique_tag, 4) = table_b.tag
WHERE table_b.tag IN( SELECT tag FROM table_b GROUP BY tag ORDER BY sample_date DESC);