I'm just playing around with SQL and I'm trying to do the following.
I have 2 tables and here is their structure:
As there are many ratings for one movie, what I'd like to do is get the avg rating per movie and have it display next to the title which is only available in the Metadata table.
This is as far as I got but obviously the issue with my SELECT statement is that it'll return the average of all movies and display it for each record:
SELECT
(SELECT
AVG(rating)
FROM
`movies-dataset.movies_data.ratings`) AS rating_avg,
metadata.title,
metadata.budget,
metadata.revenue,
metadata.genres,
metadata.original_language,
metadata.release_date
FROM
`movies-dataset.movies_data.Movies_metadata` AS metadata
INNER JOIN `movies-dataset.movies_data.ratings` AS ratings
ON metadata.id = ratings.movieId
LIMIT 10
Here is an example of the result:
I'm thinking I can potentially use a GROUP BY but when I try, I get an error
Appreciate the help!
CodePudding user response:
The following should work:
SELECT movies_metadata.title, AVG(ratings.rating)
FROM movies_metadata
LEFT JOIN ratings ON movies_metadata.id = ratings.movieID
GROUP BY movies_metadata.title
You can swap movies_metadata.title
by movies_metadata.id
if not unique.
CodePudding user response:
The LIMIT function and GROUP function might conflict with each other. Try getting the average rating as part of the inner join like this:
SELECT
ratings.averagerating,
metadata.title,
metadata.budget,
metadata.revenue,
metadata.genres,
metadata.original_language,
metadata.release_date
FROM `movies-dataset.movies_data.Movies_metadata` AS metadata
INNER JOIN (SELECT movieId, AVG(rating) averagerating FROM `movies-dataset.movies_data.ratings` GROUP by movieId) AS ratings
ON metadata.id = ratings.movieId
ORDER BY ratings.averagerating
LIMIT 5
CodePudding user response:
Maybe try something like:
Select m.movieID, (r.rate_sum / r.num_rate) as avg_rating From your_movies_table m Left Join (select movie_id, sum(rating) as ‘rate_sum’, count(rating) as ‘num_rate’ From your_ratings_table Group by movie_id) r On m.movie_id = r.movie_id
I'm using a left join because I'm not sure if all movies have been rated at least once.