SQLite: how to exclude records in a database with join tables?-CodePudding

I have a database with a schema similar to:

CREATE TABLE IF NOT EXISTS song_artist
  (song INTEGER, artist INTEGER,
   FOREIGN KEY("song") REFERENCES "Songs"("song_id") ON DELETE CASCADE,
   FOREIGN KEY("artist") REFERENCES "Artists"("artist_id") ON DELETE CASCADE);


CREATE TABLE IF NOT EXISTS Artists ("artist_id" INTEGER PRIMARY KEY,
                                                                "artist_name" TEXT);
CREATE TABLE IF NOT EXISTS Songs ("song_id" INTEGER PRIMARY KEY,
                                                            "song_name" TEXT);

Say they have the following data in them:

INSERT INTO Artists (artist_name) VALUES ("Bob");
INSERT INTO Artists (artist_name) VALUES ("Shawn");
INSERT INTO Songs (song_name) VALUES ("one song");
INSERT INTO Songs (song_name) VALUES ("another song");
INSERT INTO song_artist (artist, song) VALUES (1, 1);
INSERT INTO song_artist (artist, song) VALUES (2, 1);
INSERT INTO song_artist (artist, song) VALUES (2, 2);

Artists:

artist_id	artist_name
1	Bob
2	Shawn

Songs:

song_id	song_name
1	one song
2	another song

song_artist:

song	artist
1	1
1	2
2	2

Here, both Bob and Shawn are involved in one song, and only Shawn is involved in another song. Say I wanted to get the names of the songs where Shawn is involved, but not Bob. So, in this case it would be only another song. However, as to be expected, this query doesn't work as it gives me both the songs:

SELECT song_name
FROM Songs
JOIN Artists ON Artists.artist_id = song_artist.artist
JOIN song_artist ON song_artist.song = Songs.song_id
WHERE Artists.artist_name = "Shawn"
  AND Artists.artist_name != "Bob"

	song_name
1	one song
2	another song

This would be the result of the join table song_artist having multiple records for the same song, so even if a record with Bob's ID is excluded, there's still another record without Bob's ID that gets included in the final result.

Here's how I've found a workaround to this:

SELECT song_name
FROM Songs
JOIN Artists ON Artists.artist_id = song_artist.artist
JOIN song_artist ON song_artist.song = Songs.song_id
WHERE Artists.artist_name = "Shawn"
  AND Songs.song_id NOT IN
    (SELECT song_id
     FROM Songs
     JOIN Artists ON Artists.artist_id = song_artist.artist
     JOIN song_artist ON song_artist.song = Songs.song_id
     WHERE Artists.artist_name == "Bob")

	song_name
1	another song

Now, while this works as expected, as one can imagine, this becomes a long winded query once you have a lot more than a single join table. I also assume this is a very un-optimized query. My question is, what is a better way to do this kind of query?

CodePudding user response：

First, add a WHERE clause in your query which filters out any other artists than Shawn and Bob.
Then use conditional aggregation and in the HAVING clause add the condition that Bob does not participate in the song (which leaves only Shawn):

SELECT s.*
FROM Songs s
JOIN song_artist sa ON sa.song = s.song_id
JOIN Artists a ON a.artist_id = sa.artist
WHERE a.artist_name IN ('Shawn', 'Bob')
GROUP BY s.song_id
HAVING SUM(a.artist_name = 'Bob') = 0;

See the demo.

CodePudding user response：

You can use NOT EXISTS'to exclude Bob

SELECT song_name
FROM Songs s
JOIN song_artist sa ON sa.song = s.song_id
JOIN Artists a ON a.artist_id = sa.artist
WHERE a.artist_name = 'Shawn'
  AND NOT EXiSTS(SELECT 1 FROM song_artist 
  WHERE artist = (SELECT artist_id FROM Artists WHERE  Artists.artist_name = 'Bob') 
AND song = sa.song
)

song_name
another song

db<>fiddle here

CodePudding user response：

An exists query might be more effective in large sets and is more readable than aggregation tricks IMHO:

select * from Songs s
where 
exists (select * from Song_Artist sa inner join
   artists a on sa.artist = a.artist_id
   where s.song_id = sa.song and a.artist_Name = 'Shawn')
and   
not exists (select * from Song_Artist sa inner join
   artists a on sa.artist = a.artist_id
   where s.song_id = sa.song and a.artist_Name = 'Bob');

DBFiddle demo