I get the column "measurementResults.value" must appear in the GROUP BY clause or be used in an aggregate function
error for this query:
SELECT avg("measurementResults"."value") AS "value",
min("measurement"."timestamp") AS "timestamp",
min("testProtocolItemResults"."ranking") AS "ranking"
FROM "measurement"
JOIN "measurementResults" ON "measurementResults"."measurement" = "measurement"."id"
JOIN "conditions" ON "conditions"."measurement" = "measurement"."id"
JOIN "testProtocolItemResults" ON "testProtocolItemResults"."id" = "measurementResults"."testProtocolItemResults"
JOIN "testProtocolSessionItem" ON "testProtocolSessionItem"."id" = "measurement"."testProtocolSessionItem"
WHERE "measurement"."athlete" = 334
AND "measurementResults"."testProtocolItemResults" = 1
AND "conditions"."conditions" = '6'
GROUP BY "testProtocolSessionItem"."testProtocolSession",
"testProtocolItemResults"."ranking"
ORDER BY (CASE
WHEN "ranking" = 'greater'::text THEN "value"
ELSE NULL::double precision
END) DESC
LIMIT 3
If I change the ORDER BY clause like this, it works as expected:
...
ORDER BY "value" DESC
...
What am I doing wrong?
CodePudding user response:
Your problem has a couple of roots:
Most importantly, don't use the same name for an output column that is distinct from an input column (of the same name). That's a loaded foot-gun.
Secondly, make it a habit to table-qualify all columns used in a complex query involving multiple tables. Even if that seems to work, it might already be doing something else than you think. And even if it works correctly it may break later, if any column names are changed (added, removed renamed). With some bad luck it breaks silently, and your query happily ever after produces nonsense.
Thirdly, the SQL standard, which has somewhat confusing visibility rules. See:
In your working alternative query, "value"
resolves to the output column "value"
, which hides any input column of the same name in ORDER BY
. That works as expected (that is, if you actually meant to target the output column).
In your failing query, "value"
resolves to the input column "measurementResults.value"
. You cannot throw output columns into a new computation in ORDER BY
, you can only use them "as is". So, with output columns out of the way, "value"
resolves to the input column (now not hidden any more). And that leads to the reported error. Obviously, you cannot order by an input column after aggregating - except if you grouped by it, directly or indirectly.
You could repair your query with:
ORDER BY (ranking = 'greater') IS TRUE, "value" DESC
The sorts all rows where ranking = 'greater'
is not true to the top - like your CASE
expression would. So treating null
and false
alike.
Subtle difference: Those leading rows are sorted by value
, while your original would list them in arbitrary order. May or may not be welcome.
I assume you are aware that null
values sort on top in descending order? And that you can change that? See:
If that's not good enough (or for more complex expressions), you must be more verbose and explicit: one way is to wrap the whole query into a subquery, and order (and limit!) in the outer SELECT
:
SELECT avg_value, min_timestamp, min_ranking
FROM (
SELECT ir.ranking -- !
, avg(mr."value") AS avg_value -- !
, min(m."timestamp") AS min_timestamp -- !
, min(ir.ranking) AS min_ranking -- !
FROM measurement m
JOIN "measurementResults" mr ON mr.measurement = m.id
JOIN conditions c ON c.measurement = m.id
JOIN "testProtocolItemResults" ir ON ir.id = mr."testProtocolItemResults"
JOIN "testProtocolSessionItem" si ON si.id = m."testProtocolSessionItem"
WHERE m."athlete" = 334
AND mr."testProtocolItemResults" = 1
AND c."conditions" = '6'
GROUP BY si."testProtocolSession", ir.ranking
) sub
ORDER BY CASE WHEN ranking = 'greater' THEN "value" END DESC
LIMIT 3
Especially for queries with a small LIMIT
, this may be more expensive if Postgres cannot optimize the query plan as well any more.
Aside:
Use legal, loser-case identifiers, so you don't have to double-quote.
And use table aliases to de-noise your big queries.