I have TWO table:
TABLE ALBUM
id | AlbumDescription |
---|---|
1 | Siamo Qui |
2 | Safari |
3 | Perfetto |
TABLE SONG
id | idAlbum | SongDescription | Duration |
---|---|---|---|
1 | 1 | AlbaChiara | 30 |
2 | 1 | Gabri | 40 |
3 | 2 | ombellico | 50 |
4 | 2 | I love you baby | 10 |
5 | 3 | cose della vita | 60 |
6 | 3 | aurora | 20 |
The tables are linked via Song.idAlbum and Album.id
Now I have to make a query that extrapolates the song that lasts longer for each album, displaying the name of the song as well
By doing this query I have duplicate records, what is the best approach to eliminate duplicate records?
SELECT MAX(Duration) as Duration
,A.AlbumDescription
,C.SongDescription
FROM Album A
left JOIN Canzone C ON A.id = C.idAlbum
GROUP BY A.AlbumDescription, C.SongDescription
Result:
Duration | AlbumDescription | SongDescription |
---|---|---|
30 | Siamo Qui | AlbaChiara |
60 | Perfetto | Cose della vita |
40 | Siamo Qui | Gabri |
10 | Safari | I love you baby |
20 | Perfetto | aurora |
50 | Safari | ombellico |
What would i expect
Duration | AlbumDescription | SongDescription |
---|---|---|
60 | Perfetto | Cose della vita |
40 | Siamo Qui | Gabri |
50 | Safari | ombellico |
CodePudding user response:
We can use RANK()
here:
WITH cte AS (
SELECT a.AlbumDescription, c.SongDescription, c.Duration,
RANK() OVER (PARTITION BY a.id ORDER BY c.Duration DESC) rnk
FROM Album a
INNER JOIN Canzone c ON c.idAlbum = a.id
)
SELECT Duration, AlbumDescription, SongDescription
FROM cte
WHERE rnk = 1;
Note that if you don't expect or care about the possibility for two or more songs on the same album having the same maximum length, then you may also use ROW_NUMBER
above instead of RANK
. As written, the above would return all songs on a given album tied for being the longest.
We can write a shorter version of the above using a TOP 1 WITH TIES
trick:
SELECT TOP 1 WITH TIES a.AlbumDescription, c.SongDescription, c.Duration
FROM Album a
INNER JOIN Canzone c ON c.idAlbum = a.id
ORDER BY RANK() OVER (PARTITION BY a.id ORDER BY c.Duration DESC);
But note that this probably will not perform as well as the first version, and it should only be used when you don't care much about performance but more about code brevity.
CodePudding user response:
When you use group by with both AlbumDescription, SongDescription it makes 2 groups. Because (Siamo Qui,AlbaChiara) and (Siamo Qui,Gabri) are different. As you want to find the song with the longest duration you should group it by AlbumDescription. You might want to use album id.
SELECT MAX(Duration) as Duration
,A.AlbumDescription
,C.SongDescription
FROM Album A
left JOIN song C ON A.id = C.idAlbum
GROUP BY A.AlbumDescription;
this query returns:
Duration | AlbumDescription | SongDescription |
---|---|---|
60 | Perfetto | Cose della vita |
40 | Siamo Qui | Gabri |
50 | Safari | ombellico |
Or you can use
SELECT FROM album as A
LEFT JOIN song AS S on a.id = s.idAlbum
WHERE S.duration == (SELECT max(duration)
FROM song WHERE idAlbum = S.idAlbum);
both return the same result.
These are the table in data in case anyone wants to try;
CREATE TABLE album
(
id INTEGER,
albumDescription TEXT
);
CREATE TABLE song
(
id INTEGER,
idAlbum INTEGER REFERENCES album(id),
songDescription TEXT,
duration INTEGER
);
INSERT INTO album VALUES
(1, 'Siamo Qui'),(2, 'Safari'),(3, 'Perfetto');
INSERT INTO song VALUES
(1,1, 'AlbaChiara',30),
(2,1, 'Gabri',40),
(3,2, 'ombellico',50),
(4,2, 'I love you baby',10),
(5,3, 'cose della vita',60),
(6,3, 'aurora',20);
CodePudding user response:
You can also put aggregation on sone description like this:
SELECT MAX(Duration) as Duration
,A.AlbumDescription
,MAX(C.SongDescription) as SongDescription
FROM Album A
left JOIN Canzone C ON A.id = C.idAlbum
GROUP BY A.AlbumDescription