Home > Mobile >  Why are calculations included in a GROUP BY in SQL?
Why are calculations included in a GROUP BY in SQL?

Time:02-10

Is there ever a reason to include a function in the GROUP BY?

I see in in code and even in some instructions but have never been explained why, outside of it's easier to copy from the SELECT then typing it again.

Example:

SELECT 
    ProductName,
    IFF (UnitsInStock > 0, 'true', 'false) AS InStock
    COUNT(ProductID)
FROM Products
GROUP BY ProductName, IFF (UnitsInStock > 0, 'true', 'false)

I have started removing the functions in all my coding:

GROUP BY ProductName, UnitsInStock

Is there a reason to keep the full expression?

CodePudding user response:

grouping by IFF (UnitsInStock > 0, 'true', 'false) produces -at most- two rows per product, grouping by `UnitsInStock ' produces row per unique value of the UnitsInStock.

CodePudding user response:

There are many articles out there (and training material to boot) about this subject, but the explanation is fairly straightforward really.

Keep in mind that (T-) SQL is not Turing-proof, it does not actually compiles to code - it transcribes your input - and therefor what comes in must go out - so if you input something SQL cannot "find" by itself, you need to reproduce it.

You can not GROUP BY InStock, because there is no InStock to use : you use a temporary "variable" created by the function (it is not a variable, that's a lie, but for sake of argument keep that metaphor in mind).

EDIT: I want to point out @userMT has a good point, when you actually remove that code there is a good chance you might get more problems then you're bargaining for

CodePudding user response:

It depends on the RDBMS you're using and for some of those what settings you use, however the SQL standard allows grouping by column numbers:

SELECT 
    ProductName,
    IFF (UnitsInStock > 0, 'true', 'false) AS InStock
    COUNT(ProductID)
FROM Products
GROUP BY 1, 2

Most RBDMSs support this, and this is how I always code.

It's completely safe because if you get the numbers wrong (too many, too few, etc), the statement will cause a syntax error, and referring to the columns by their expression adds no value.

  • Related