I have a database with a schema similar to:
CREATE TABLE IF NOT EXISTS song_artist
(song INTEGER, artist INTEGER,
FOREIGN KEY("song") REFERENCES "Songs"("song_id") ON DELETE CASCADE,
FOREIGN KEY("artist") REFERENCES "Artists"("artist_id") ON DELETE CASCADE);
CREATE TABLE IF NOT EXISTS Artists ("artist_id" INTEGER PRIMARY KEY,
"artist_name" TEXT);
CREATE TABLE IF NOT EXISTS Songs ("song_id" INTEGER PRIMARY KEY,
"song_name" TEXT);
Say they have the following data in them:
INSERT INTO Artists (artist_name) VALUES ("Bob");
INSERT INTO Artists (artist_name) VALUES ("Shawn");
INSERT INTO Songs (song_name) VALUES ("one song");
INSERT INTO Songs (song_name) VALUES ("another song");
INSERT INTO song_artist (artist, song) VALUES (1, 1);
INSERT INTO song_artist (artist, song) VALUES (2, 1);
INSERT INTO song_artist (artist, song) VALUES (2, 2);
Artists
:
artist_id | artist_name |
---|---|
1 | Bob |
2 | Shawn |
Songs
:
song_id | song_name |
---|---|
1 | one song |
2 | another song |
song_artist
:
song | artist |
---|---|
1 | 1 |
1 | 2 |
2 | 2 |
Here, both Bob
and Shawn
are involved in one song
, and only Shawn
is involved in another song
. Say I wanted to get the names of the songs where Shawn
is involved, but not Bob
. So, in this case it would be only another song
. However, as to be expected, this query doesn't work as it gives me both the songs:
SELECT song_name
FROM Songs
JOIN Artists ON Artists.artist_id = song_artist.artist
JOIN song_artist ON song_artist.song = Songs.song_id
WHERE Artists.artist_name = "Shawn"
AND Artists.artist_name != "Bob"
song_name | |
---|---|
1 | one song |
2 | another song |
This would be the result of the join table song_artist
having multiple records for the same song, so even if a record with Bob
's ID is excluded, there's still another record without Bob
's ID that gets included in the final result.
Here's how I've found a workaround to this:
SELECT song_name
FROM Songs
JOIN Artists ON Artists.artist_id = song_artist.artist
JOIN song_artist ON song_artist.song = Songs.song_id
WHERE Artists.artist_name = "Shawn"
AND Songs.song_id NOT IN
(SELECT song_id
FROM Songs
JOIN Artists ON Artists.artist_id = song_artist.artist
JOIN song_artist ON song_artist.song = Songs.song_id
WHERE Artists.artist_name == "Bob")
song_name | |
---|---|
1 | another song |
Now, while this works as expected, as one can imagine, this becomes a long winded query once you have a lot more than a single join table. I also assume this is a very un-optimized query. My question is, what is a better way to do this kind of query?
CodePudding user response:
First, add a WHERE
clause in your query which filters out any other artists than Shawn and Bob.
Then use conditional aggregation and in the HAVING
clause add the condition that Bob does not participate in the song (which leaves only Shawn):
SELECT s.*
FROM Songs s
JOIN song_artist sa ON sa.song = s.song_id
JOIN Artists a ON a.artist_id = sa.artist
WHERE a.artist_name IN ('Shawn', 'Bob')
GROUP BY s.song_id
HAVING SUM(a.artist_name = 'Bob') = 0;
See the demo.
CodePudding user response:
You can use NOT EXISTS'
to exclude Bob
SELECT song_name FROM Songs s JOIN song_artist sa ON sa.song = s.song_id JOIN Artists a ON a.artist_id = sa.artist WHERE a.artist_name = 'Shawn' AND NOT EXiSTS(SELECT 1 FROM song_artist WHERE artist = (SELECT artist_id FROM Artists WHERE Artists.artist_name = 'Bob') AND song = sa.song )
song_name another song
db<>fiddle here
CodePudding user response:
An exists query might be more effective in large sets and is more readable than aggregation tricks IMHO:
select * from Songs s
where
exists (select * from Song_Artist sa inner join
artists a on sa.artist = a.artist_id
where s.song_id = sa.song and a.artist_Name = 'Shawn')
and
not exists (select * from Song_Artist sa inner join
artists a on sa.artist = a.artist_id
where s.song_id = sa.song and a.artist_Name = 'Bob');