I am supposed to find the most common genre per country, and the total amount of movies that country makes. My output is just a query away from being correct (se example)
select count(c.filmid) as amountMovies, c.country as country, avg(r.rank) as avg rank, fgt.genre
from filmcountry c
full join filmrating r on c.filmid = r.filmid
left join (
select count(co.filmid) as no, co.country, fg.genre from filmcountry co
full join filmgenre fg on co.filmid = fg.filmid
group by co.country, fg.genre
order by co.country, no desc)
fgt on c.country = fgt.country
group by c.country, fgt.genre
order by c.country;
As you can see, I am able to see the amount of movies, country, av rang and there most famous genre (Adventure), but I am not able to only select the first line. Same with Albania.
I want it to be like this:
amountMovies | country | avg rank | genre
-------- -------------------------------- -------------------- -------------
29 | Afghanistan | 3.9962963086587413 | Adventure
874 | Albania | 7.149999976158142 | Music
This is what I am getting! See example below-->
amountMovies | country | avg rank | genre
-------- -------------------------------- -------------------- -------------
29 | Afghanistan | 3.9962963086587413 | Adventure
29 | Afghanistan | 3.9962963086587413 | Music
29 | Afghanistan | 3.9962963086587413 | Short
29 | Afghanistan | 3.9962963086587413 | Action
29 | Afghanistan | 3.9962963086587413 | Biography
29 | Afghanistan | 3.9962963086587413 | Documentary
29 | Afghanistan | 3.9962963086587413 | War
29 | Afghanistan | 3.9962963086587413 | Drama
29 | Afghanistan | 3.9962963086587413 |
874 | Albania | 7.149999976158142 | Music
874 | Albania | 7.149999976158142 | Documentary
874 | Albania | 7.149999976158142 | Drama
874 | Albania | 7.149999976158142 |
874 | Albania | 7.149999976158142 | Family
874 | Albania | 7.149999976158142 | Thriller
CodePudding user response:
Per country you want to get the total movie count and the genre(s) with the highest movie count in the country.
Count the movies per country and genre and rank the genres per country by count. Only keep the best ranked genre(s) per country. That's about it. Now either just display the rows or aggregate by country in order to only get one result row per country showing the genres in a list.
select
country,
string_agg(genre, ',') as top_genres,
any_value(movie_count) as country_movie_count
from
(
select
fc.country,
fg.genre,
rank() over (partition by fc.country order by count(*) desc) as genre_rnk,
sum(count(*)) over (partition by fc.country) as movie_count
from filmcountry fc
join filmgenre fg using (filmid)
group by fc.country, fg.genre
)
where genre_rnk = 1
group by country;