Home > Back-end >  How to find the max of the sum for each group in SQL?
How to find the max of the sum for each group in SQL?

Time:11-28

I have a database with tables ARTIST and SONG. Each song has a number of reproductions, an album associated and the artist_id that owns it. I want to get for each artist, the album that has the highest number of reproductions counting al of its songs. The tables are something like this:

Artist:

Artist_Id Name
1 Ignacio Guitar
2 Rosalia
3 Makande

Song:

Artist_Id Name N_reproductions Name_album
1 Song1 10 Album1
1 Song2 15 Album1
1 Song3 13 Album1
1 Song4 20 Album2
1 Song5 12 Album2
1 Song6 25 Album2
2 Song7 17 Album3
2 Song8 21 Album3
2 Song9 20 Album4
2 Song10 25 Album4
2 Song11 31 Album4

So the result I want to get would be

Name Name_album
Ignacio Guitar Album2
Rosalia Album4

So far I've tried this:

SELECT a.name, s.name_alb
  FROM artist a
  JOIN song s
    ON (a.artist_id = s.artist_id)
 GROUP BY a.artist_id, a.name, s.name_alb
HAVING SUM(s.n_reproductions) = (SELECT MAX(SUM(s1.n_reproductions))
                                   FROM song s1
                                  WHERE s1.artist_id = a.artist_id
                                    AND s1.name_alb = s.name_alb
                                  GROUP BY s1.artist_id, s1.name_alb);

but this returns every album from every artist instead.

CodePudding user response:

In your subquery you are looking at one album with

WHERE S1.ARTIST_ID = A.ARTIST_ID AND S1.NAME_ALB = S.NAME_ALB

Of this you get the

SUM(S1.N_REPRODUCTIONS)

which is one value in spite of

GROUP BY S1.ARTIST_ID, S1.NAME_ALB)

because it is only one album you are looking at.

Oracle allows another aggregation directly without having to write a subquery. You use this to get the maximum sum, but as there is only one sum, you just get the album's reproductions count. So you end up with

HAVING <album's reproduction count> = <album's reproduction count>

The easiest approach to get an artist's album(s) with the maximum reproduction sum is probably to get that maximum via MAX OVER. Then you can just filter those rows and only keep albums that have that maximum sum.

SELECT a.name, r.name_alb
FROM
(
  SELECT
    artist_id, name_alb,
    SUM(n_reproductions) AS sum_repros,
    MAX(SUM(n_reproductions)) OVER (PARTITION BY artist_id) AS max_sum_repros
  FROM song
  GROUP BY artist_id, name_alb
) r
JOIN artist a ON a.artist_id = r.artist_id
WHERE r.sum_repros = r.max_sum_repros
ORDER BY a.name, r.name_alb;

The same aggregated for the case an artist has two or more top albums:

SELECT a.name, LISTAGG(r.name_alb, ', ') WITHIN GROUP (ORDER BY r.name_alb) AS albums
FROM
(
  SELECT
    artist_id, name_alb,
    SUM(n_reproductions) AS sum_repros,
    MAX(SUM(n_reproductions)) OVER (PARTITION BY artist_id) AS max_sum_repros
  FROM song
  GROUP BY artist_id, name_alb
) r
JOIN artist a ON a.artist_id = r.artist_id
WHERE r.sum_repros = r.max_sum_repros
GROUP BY a.name
ORDER BY a.name;

Demo: https://dbfiddle.uk/b5JI5erl

CodePudding user response:

From Oracle 12, you can use:

SELECT MAX(a.name) AS name,
       s.name_album
FROM   artist a
       INNER JOIN song s
       ON (a.artist_id = s.artist_id)
GROUP BY a.artist_id, s.name_album
ORDER BY
       DENSE_RANK() OVER (
         PARTITION BY a.artist_id
         ORDER BY SUM(n_reproductions) DESC
       )
FETCH FIRST ROW WITH TIES;

Which, for the sample data:

CREATE TABLE Artist (Artist_Id, Name) AS
SELECT 1, 'Ignacio Guitar' FROM DUAL UNION ALL
SELECT 2, 'Rosalia' FROM DUAL UNION ALL
SELECT 3, 'Makande' FROM DUAL;

CREATE TABLE Song (Artist_Id, Name, N_reproductions, Name_album) AS
SELECT 1, 'Song1',  10, 'Album1' FROM DUAL UNION ALL
SELECT 1, 'Song2',  15, 'Album1' FROM DUAL UNION ALL
SELECT 1, 'Song3',  13, 'Album1' FROM DUAL UNION ALL
SELECT 1, 'Song4',  20, 'Album2' FROM DUAL UNION ALL
SELECT 1, 'Song5',  12, 'Album2' FROM DUAL UNION ALL
SELECT 1, 'Song6',  25, 'Album2' FROM DUAL UNION ALL
SELECT 2, 'Song7',  17, 'Album3' FROM DUAL UNION ALL
SELECT 2, 'Song8',  21, 'Album3' FROM DUAL UNION ALL
SELECT 2, 'Song9',  20, 'Album4' FROM DUAL UNION ALL
SELECT 2, 'Song10', 25, 'Album4' FROM DUAL UNION ALL
SELECT 2, 'Song11', 31, 'Album4' FROM DUAL

Outputs:

NAME NAME_ALBUM
Ignacio Guitar Album2
Rosalia Album4

fiddle

  • Related