Home > Back-end >  Return only max value from one column when all other columns are the same
Return only max value from one column when all other columns are the same

Time:12-10

I have a Snowflake query that currently outputs results as follows:

ID City Fruit Quantity
123 LA Banana 1
123 LA Strawberry 1
124 LA Banana 3
125 NY Apple 2

I would like it to return only one ID line containing the max value of the fruit, so in this case, it would look like:

ID City Fruit Quantity
123 LA Strawberry 1
124 LA Banana 3
125 NY Apple 2

I have tried doing ROW_NUMBER() OVER Partition but it only ends up selecting rows that don't have two identical IDs, I have tried select distinct but it doesn't choose distinct values for Fruit, and I have tried select max(Fruit) but I get an error saying that it's not a valid group by expression.

CodePudding user response:

Have you tried something like this?

select ID, City, Fruit, Quantity FROM TestData
    QUALIFY row_number() over (partition by ID, City order by Fruit DESC ) = 1;

 ----- ------ ------------ ---------- 
| ID  | CITY |   FRUIT    | QUANTITY |
 ----- ------ ------------ ---------- 
| 123 | LA   | Strawberry |        1 |
| 124 | LA   | Banana     |        3 |
| 125 | NY   | Apple      |        2 |
 ----- ------ ------------ ---------- 

CodePudding user response:

We can start using MAX and GROUP BY to fetch the id and their maximum fruit:

SELECT id, MAX (fruit) AS fruit
FROM fruits
GROUP BY id;

This can be used as subquery to also select the other columns for these rows:

SELECT id, city, fruit, quantity
FROM fruits
WHERE (id, fruit)
IN 
  (SELECT id,
  MAX (fruit) AS fruit
  FROM fruits
  GROUP BY id);

If it's really intended to group by the other columns, too (which seems strange to me, but maybe this is wanted), we can just extend this:

SELECT id, city, fruit, quantity
FROM fruits
WHERE (id, city, fruit, quantity)
IN (SELECT id, city,
       MAX (fruit) AS fruit,
       quantity
FROM   fruits
GROUP  BY id, city, quantity);

For the sample data in the question, both queries will give the same result:

ID City Fruit Quantity
123 LA Strawberry 1
124 LA Banana 3
125 NY Apple 2

Try out here: db<>fiddle

  • Related