Home > database >  Selecting max value grouped by specific column
Selecting max value grouped by specific column

Time:12-22

Focused DB tables:

enter image description here

Task:

For given location ID and culture ID, get max(crop_yield.value) * culture_price.price (let's call this multiplication monetaryGain) grouped by year, so something like:

[
   {
      "year":2014,
      "monetaryGain":...
   },
   {
      "year":2015,
      "monetaryGain":...
   },
   {
      "year":2016,
      "monetaryGain":...
   },
   ...
]

Attempt:

SELECT cp.price * max(cy.value) AS monetaryGain, EXTRACT(YEAR FROM cy.date) AS year
FROM culture_price AS cp
JOIN culture AS c ON cp.id_culture = c.id
JOIN crop_yield AS cy ON cy.id_culture = c.id
WHERE c.id = :cultureId AND cy.id_location = :locationId AND cp.year = year
GROUP BY year
ORDER BY year

The problem:

"columns "cp.price", "cy.value" and "cy.date" must appear in the GROUP BY clause or be used in an aggregate function"

If I put these three columns in GROUP BY, I won't get expected result - It won't be grouped just by year obviously.

Does anyone have an idea on how to fix/write this query better in order to get task result?

Thanks in advance!

CodePudding user response:

The fix

Rewrite monetaryGain to be:

 max(cp.price * cy.value) AS monetaryGain 

That way you will not be required to group by cp.price because it is not outputted as an group member, but used in aggregate.

Why?

When you write GROUP BY query you can output only columns that are in GROUP BY list and aggregate function values. Well this is expected - you expect single row per group, but you may have several distinct values for the field that is not in grouping column list.

For the same reason you can not use a non grouping column(-s) in arithmetic or any other (not aggregate) function because this would lead in several results for in single row - there would not be a way to display.

This is VERY loose explanation but I hope will help to grasp the concept.

Aliases in GROUP BY

Also you should not use aliases in GROUP BY. Use:

GROUP BY EXTRACT(YEAR FROM cy.date) 

Using alias in GROUP BY is not allowed. This link might explain why: https://www.postgresql.org/message-id/[email protected]

  • Related