I'm using mysql. I have two tables, one is about movie type, and the other is about movie rating with timestamps. I want to join these two tables together with movie id to count the average rating for each type of movie. I'm trying to extract only the movie types which have at least 10 ratings per film and the ratings made in December, and order by the highest to lowest average rating.
Table 'types'
movieId | type |
---|---|
1 | Drama |
2 | Adventure |
3 | Comedy |
... | ... |
Table 'ratings'
movieId | rating | timestamp |
---|---|---|
1 | 1 | 851786086 |
2 | 1.5 | 1114306148 |
1 | 2 | 1228946388 |
3 | 2 | 850723898 |
1 | 2.5 | 1167422234 |
2 | 2.5 | 1291654669 |
1 | 3 | 851345204 |
2 | 3 | 944978286 |
3 | 3 | 965088579 |
3 | 3 | 1012598088 |
1 | 3.5 | 1291598726 |
1 | 4 | 1291779829 |
1 | 4 | 850021197 |
2 | 4 | 945362514 |
1 | 4.5 | 1072836909 |
1 | 5 | 881166397 |
1 | 5 | 944892273 |
2 | 5 | 1012598088 |
... | ... | ... |
Expect result: (Nb ratings >= 10 and rate given in December)
type | Avg_Rating |
---|---|
Drama | 3.45 |
I'm trying to write the query like below, but I'm not able to execute it. (around 10 thousand data in original table) Where should I adjust my query?
SELECT DISTINCT T.type, AVG(R.rating) FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
WHERE R.timestamp LIKE (
SELECT FROM_UNIXTIME(R.timestamp,'%M') AS Month FROM ratings
GROUP BY Month
HAVING Month = 'December')
GROUP BY T.type
HAVING COUNT(R.rating) >=10
ORDER BY AVG(R.rating) DESC;
CodePudding user response:
You can try next query.
SELECT DISTINCT T.type, AVG(R.rating) FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
GROUP BY T.type
HAVING
COUNT(R.rating) >= 10 -- have 10 or more rating records
AND SUM(MONTH(FROM_UNIXTIME(R.timestamp)) = 12) > 0 -- have at least one rating in December
ORDER BY AVG(R.rating) DESC;
CodePudding user response:
I see two problems:
timestamp LIKE - what's that supposed to do? and
inner query with GROUP BY by without any aggregation. Perhaps you meant WHERE? And anyway you don't need it at all - just do the same check for December directly on timestamp, w/o LIKE and w/o subquery
SELECT DISTINCT T.type, AVG(R.rating) FROM types AS T INNER JOIN ratings AS R ON T.movieId = R.movieId WHERE FROM_UNIXTIME(R.timestamp,'%M') = 'December' GROUP BY T.type HAVING COUNT(R.rating) >=10 ORDER BY AVG(R.rating) DESC;