How would I select all rows that contain a specific genre and aggregate the columns?

Time:12-04

Here is a very simple database schema I made on the fly, in which the primary entity, movie, has many-to-many relationships with genres and actors, each through its own bridge table:

(image: database schema)

Here is the query I am using to get all movies along with the actors and genres belonging to those movies (note: the actual actors and genres are not accurate, as they've been randomized to serve as quick mock data).

SELECT 
    movie.movie_id, 
    string_agg(DISTINCT movie_name, ',') AS "movies", 
    string_agg(DISTINCT genre.genre_name, ',') AS "genres",
    string_agg(DISTINCT actor.actor_name, ',') AS "actors"
FROM movie
INNER JOIN movie_genre ON movie.movie_id = movie_genre.movie_id
INNER JOIN genre ON movie_genre.genre_id = genre.genre_id
INNER JOIN movie_actor ON movie.movie_id = movie_actor.movie_id
INNER JOIN actor ON movie_actor.actor_id = actor.actor_id
GROUP BY movie.movie_id;

Example of result:
(image: result)

Now what I want to do is search for all movies that contain a specific genre such as action but I also want to include the other genres as well in the result. How would I accomplish this in the most efficient way?

Here is what I've tried so far:

SELECT 
    movie.movie_id, 
    string_agg(DISTINCT movie_name, ',') AS "movies", 
    string_agg(DISTINCT genre.genre_name, ',') AS "genres",
    string_agg(DISTINCT actor.actor_name, ',') AS "actors"
FROM movie
INNER JOIN movie_genre ON movie.movie_id = movie_genre.movie_id
INNER JOIN genre ON movie_genre.genre_id = genre.genre_id
INNER JOIN movie_actor ON movie.movie_id = movie_actor.movie_id
INNER JOIN actor ON movie_actor.actor_id = actor.actor_id
WHERE EXISTS (SELECT 1 
              FROM movie_genre
              WHERE LOWER(genre_name) = 'action')
GROUP BY movie.movie_id;

While this works to filter the data, it excludes all of the genres except for the one inside the WHERE clause, as seen in the image below:

(image: only one genre)

How would I get something like the result below when filtering, and how would I make it as efficient as possible? What I want (multiple genres showing under the genres column after searching):

(image: result)

CodePudding user response:

To achieve the result you want, you can use CASE expressions inside the aggregates, together with GROUP BY, to separate the action genre from the other genres.

Here is one way to do this:

SELECT 
    movie.movie_id, 
    string_agg(DISTINCT movie_name, ',') AS "movies", 
    string_agg(DISTINCT 
        CASE
            WHEN LOWER(genre.genre_name) = 'action' THEN genre.genre_name
            ELSE NULL
        END, ',') AS "action_genres",
    string_agg(DISTINCT 
        CASE
            WHEN LOWER(genre.genre_name) != 'action' THEN genre.genre_name
            ELSE NULL
        END, ',') AS "other_genres",
    string_agg(DISTINCT actor.actor_name, ',') AS "actors"
FROM movie
INNER JOIN movie_genre ON movie.movie_id = movie_genre.movie_id
INNER JOIN genre ON movie_genre.genre_id = genre.genre_id
INNER JOIN movie_actor ON movie.movie_id = movie_actor.movie_id
INNER JOIN actor ON movie_actor.actor_id = actor.actor_id
GROUP BY movie.movie_id;

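This returns two genre columns per movie: action_genres, which holds the action genre when the movie has it and is NULL otherwise, and other_genres, which holds the remaining genres. Note that this query does not remove any movies from the result; it only splits each movie's genres into the two columns, so movies without the action genre still appear, with a NULL action_genres.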
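If the goal is to keep only action movies while still listing all of their genres in a single column, one option is to filter the groups after aggregation rather than the rows before it. The sketch below assumes PostgreSQL (which the use of string_agg suggests) and the table and column names from the question; bool_or is a built-in PostgreSQL aggregate that is true when the condition holds for at least one row in the group.

```sql
-- Sketch: keep every genre in the aggregate, then drop groups (movies)
-- that have no 'action' row at all. Table and column names are taken
-- from the question; the schema itself is assumed, not verified.
SELECT
    movie.movie_id,
    string_agg(DISTINCT movie_name, ',') AS "movies",
    string_agg(DISTINCT genre.genre_name, ',') AS "genres",
    string_agg(DISTINCT actor.actor_name, ',') AS "actors"
FROM movie
INNER JOIN movie_genre ON movie.movie_id = movie_genre.movie_id
INNER JOIN genre ON movie_genre.genre_id = genre.genre_id
INNER JOIN movie_actor ON movie.movie_id = movie_actor.movie_id
INNER JOIN actor ON movie_actor.actor_id = actor.actor_id
GROUP BY movie.movie_id
-- HAVING is evaluated per group, after all genre rows have been collected
HAVING bool_or(LOWER(genre.genre_name) = 'action');
```

This form still joins and aggregates every movie before discarding the non-action ones, so it is the simplest change to the original query but not necessarily the fastest on a large table.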

You can further optimize the query by using a subquery to filter the movies by the action genre before joining to the other tables. This will reduce the amount of data that needs to be processed by the rest of the query, which can improve its performance. Here is an example of how you can do this:

SELECT 
    movie.movie_id, 
    string_agg(DISTINCT movie_name, ',') AS "movies", 
    string_agg(DISTINCT genre.genre_name, ',') AS "genres",
    string_agg(DISTINCT actor.actor_name, ',') AS "actors"
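FROM (
    -- genre_name lives in genre, not in the bridge table, so join to it
    SELECT mg.movie_id
    FROM movie_genre AS mg
    INNER JOIN genre AS g ON mg.genre_id = g.genre_id
    WHERE LOWER(g.genre_name) = 'action'
) AS action_movies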
INNER JOIN movie ON action_movies.movie_id = movie.movie_id
INNER JOIN movie_genre ON movie.movie_id = movie_genre.movie_id
INNER JOIN genre ON movie_genre.genre_id = genre.genre_id
INNER JOIN movie_actor ON movie.movie_id = movie_actor.movie_id
INNER JOIN actor ON movie_actor.actor_id = actor.actor_id
GROUP BY movie.movie_id;

This query will first select only the movies that have the action genre, and then join to the other tables to get the full information for those movies. This can help improve the performance of the query, especially if there are a large number of movies in the database.
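The EXISTS attempt in the question can also be made to work. Assuming movie_genre holds only movie_id and genre_id, as a bridge table normally does, the unqualified genre_name inside that subquery most likely resolves to the outer genre.genre_name, so the condition is tested against each joined row and every non-action row is dropped. A minimal sketch of a corrected version, using hypothetical aliases mg and g, correlates the subquery on movie_id instead:

```sql
-- Sketch: the subquery asks "does this movie have an action genre?"
-- using its own copies of movie_genre and genre (aliases mg and g are
-- illustrative). It never references the outer genre join.
SELECT
    movie.movie_id,
    string_agg(DISTINCT movie_name, ',') AS "movies",
    string_agg(DISTINCT genre.genre_name, ',') AS "genres",
    string_agg(DISTINCT actor.actor_name, ',') AS "actors"
FROM movie
INNER JOIN movie_genre ON movie.movie_id = movie_genre.movie_id
INNER JOIN genre ON movie_genre.genre_id = genre.genre_id
INNER JOIN movie_actor ON movie.movie_id = movie_actor.movie_id
INNER JOIN actor ON movie_actor.actor_id = actor.actor_id
WHERE EXISTS (
    SELECT 1
    FROM movie_genre AS mg
    INNER JOIN genre AS g ON g.genre_id = mg.genre_id
    WHERE mg.movie_id = movie.movie_id          -- correlate on the movie
      AND LOWER(g.genre_name) = 'action'
)
GROUP BY movie.movie_id;
```

Because the filter is decided per movie rather than per joined row, every genre of a matching movie survives into string_agg.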
