SQL Version: MySQL 8.0 or SQL Server
SQL Fiddle: https://www.db-fiddle.com/f/wcHeXkcynUiYP3qzryoYJ7/6
I have a table of images and a table of tags that link to those images.
================================== ===================================================
| tb_images | | tb_imagetags |
================================== ===================================================
| f_imageID | f_imagefilename | | f_imagetagID | f_imagetagimage | f_imagetagname |
---------------------------------- ---------------------------------------------------
| 1 | 1.jpg | | 10 | 1 | November |
| 2 | 2.jpg | | 11 | 1 | 2021 |
| 3 | 3.jpg | | 12 | 2 | November |
================================== | 13 | 2 | 2020 |
| 14 | 3 | December |
| 15 | 3 | 2020 |
===================================================
I want to be able to pass (2) tags to the query and have it select only the images that match BOTH tags. For example, I want to pass November
and 2021
and have it return only 1.jpg
.
If I do something like this:
SELECT f_imageID, f_imagefilename
FROM tb_images
LEFT JOIN tb_imagetags
ON f_imagetagimage = f_imageID
WHERE f_imagetagname = 'November'
OR f_imagetagname = '2021'
But that returns:
f_imageID f_imagefilename
================================
1 1.jpg
1 1.jpg
2 2.jpg
How can I rewrite this query to only get images that match both tags?
CodePudding user response:
One method uses aggregation:
SELECT i.f_imageID, i.f_imagefilename
FROM tb_images i
INNER JOIN tb_imagetags it
ON it.f_imagetagimage = i.f_imageID
GROUP BY i.f_imageID, i.f_imagefilename
HAVING SUM(f_imagetagname = 'November') > 0 AND
SUM(f_imagetagname = '2021') > 0;
The idea is to aggregate by image and then assert that both November
and 2021
appear as tag values, across some records within each image group.
Here is your updated DB Fiddle.
CodePudding user response:
You can us EXISTS for that
SELECT DISTINCT f_imageID, f_imagefilename
FROM tb_images
LEFT JOIN tb_imagetags fi2
ON f_imagetagimage = f_imageID
WHERE f_imagetagname = 'November'
AND EXISTS(SELECT 1 FROM tb_imagetags Fi WHERE f_imagetagname = '2021' AND fi.f_imageID = fi2.f_imageID)
CodePudding user response:
The problem is your data is related across different rows. If all the data was in the same row then it'd be easy
SELECT * FROM blah WHERE month = nov and year = 2021
When it's in different rows you want to get both rows like you're doing..
..but then you only want those images for which there are two rows. If there is only one row (eg only Nov or only 2021) you don't want that
There are various ways to do it. One is to join the tags table to itself, having filtered one of the sides to just months and the other to just years
tb_imagetags tmonth
JOIN tb_imagetags tyear
ON
tmonth.f_imagetagname = 'November' AND
tyear.f_imagetagname = '2021' AND
tmonth.f_imagetagimage = tyear.f_imagetagimage
This would implicitly put month nov and year 2021 "on the same row" so only images with both those tags would appear in the join result..
..but probably the usual way we do such "across row" queries is to check the count after grouping them, or check that the min is x and the max is y, for example:
SELECT f_imageID, f_imagefilename
FROM tb_images
INNER JOIN tb_imagetags
ON f_imagetagimage = f_imageID
WHERE f_imagetagname = 'November'
OR f_imagetagname = '2021'
GROUP BY f_imageID
HAVING COUNT(*) = 2
Or
HAVING MIN(f_imagetagname) = '2021' AND MAX( f_imagetagname) = 'November'
The count thing works if the tag names are distinct. If you can double up November by accident then it'll pick those up too. The min max only works for two tags.. you can also use something like
HAVING SUM(CASE f_imagetagname WHEN 'November' THEN 1 WHEN '2021' THEN 2 END) = 3
And that's good for any number of criteria, you just go up in powers of two, so for 3 tags you case when 1,2,4 and demand the sum be 7, you could also go in powers of anything, like base 10.. go up in 1,10,100 and demand the sum be 111..
You can also ask that there exists a related row multiple times:
SELECT f_imageID, f_imagefilename
FROM tb_images
WHERE
EXISTS(SELECT null FROM tb_imagetags WHERE f_imagetagimage = f_imageID AND f_imagetagname = 'November')
AND
EXISTS(SELECT null FROM tb_imagetags WHERE f_imagetagimage = f_imageID AND f_imagetagname = '2021')
EXISTS returns true if there is a row that meets the criteria: his sql means "images where there is some tag row that is November and there is some(other) tag row that is 2021"
Whatever you do, you need to think of a way to group the data up across the N rows where it exists and then do something that means the rows as a group meet the criteria. That's a trick because we don't usually think in those set terms as humans, we tend to think more "row by row"