I have this SQL query:
SELECT
Runs.runID, Runs.runDateTime,
COUNT(Files.destFileID) AS FileCount
FROM
Runs
LEFT OUTER JOIN
Files ON Files.runID = Runs.runID
WHERE
FileCount > 0
GROUP BY
Runs.runID, Runs.runDateTime
ORDER BY
Runs.runDateTime
It runs fine and displays the expected result without the WHERE
line, but I do need to filter the resultant data as that would imply.
The error I'm getting is
Invalid Column Name 'FileCount'
after the WHERE
keyword.
I've been reading around and I can't find any resources online that include all of the elements I have.
CodePudding user response:
You can't reference a column by it's alias in the WHERE
, you would need to repeat the expression in the WHERE
. As, however, the expression you have is an aggregate, you actually need to put this in the HAVING
:
SELECT R.runID,
R.runDateTime,
COUNT(F.destFileID) AS FileCount
FROM dbo.Runs R
LEFT OUTER JOIN dbo.Files ON F.runID = R.runID
GROUP BY R.runID,
R.runDateTime
HAVING COUNT(F.destFileID) > 0
ORDER BY R.runDateTime;
To elaborate on why you can't reference the column by it's alias in the WHERE
(or HAVING
here), this is due to the Logical Processing Order of the SELECT statement, which shows that the WHERE
is 4th part processed, and the HAVING
7th, however, the SELECT
is the 8th processed part. As the SELECT
hasn't been processed when the WHERE
is processed it would be impossible for something defined in it [the SELECT
] to be referenced yet. In a programmatical sense, it would be like trying to reference a variable before you define it.
The order of operations, per the documentation is as follows (thought note this isn't always the case, but the documentation linked earlier covers this in further detail):
- FROM
- ON
- JOIN
- WHERE
- GROUP BY
- WITH CUBE or WITH ROLLUP
- HAVING
- SELECT
- DISTINCT
- ORDER BY
- TOP
CodePudding user response:
Thanks Larnu, I'll accept your answer when the 10 minutes have expired, but after posting my question, I found that I could just subquery the whole thing:
SELECT *
FROM (
SELECT Runs.runID ,Runs.runDateTime ,COUNT(Files.destFileID) AS FileCount
FROM Runs
LEFT OUTER JOIN Files ON Files.runID = Runs.runID
WHERE FileCount > 0
GROUP BY Runs.runID, Runs.runDateTime
ORDER BY Runs.runDateTime
)
AS results
WHERE FileCount > 0
ORDER BY runDateTime
This also works. Does anyone know which of these options are best practices, or the performance of each?