Home > Mobile >  SQL Where clause order
SQL Where clause order

Time:12-17

I have this SQL query:

SELECT 
    Runs.runID, Runs.runDateTime,
    COUNT(Files.destFileID) AS FileCount
FROM
    Runs
LEFT OUTER JOIN 
    Files ON Files.runID = Runs.runID
WHERE 
    FileCount > 0
GROUP BY 
    Runs.runID, Runs.runDateTime
ORDER BY 
    Runs.runDateTime

It runs fine and displays the expected result without the WHERE line, but I do need to filter the resultant data as that would imply.

The error I'm getting is

Invalid Column Name 'FileCount'

after the WHERE keyword.

I've been reading around and I can't find any resources online that include all of the elements I have.

CodePudding user response:

You can't reference a column by it's alias in the WHERE, you would need to repeat the expression in the WHERE. As, however, the expression you have is an aggregate, you actually need to put this in the HAVING:

SELECT R.runID,
       R.runDateTime,
       COUNT(F.destFileID) AS FileCount
FROM dbo.Runs R
     LEFT OUTER JOIN dbo.Files ON F.runID = R.runID
GROUP BY R.runID,
         R.runDateTime
HAVING COUNT(F.destFileID) > 0
ORDER BY R.runDateTime;

To elaborate on why you can't reference the column by it's alias in the WHERE (or HAVING here), this is due to the Logical Processing Order of the SELECT statement, which shows that the WHERE is 4th part processed, and the HAVING 7th, however, the SELECT is the 8th processed part. As the SELECT hasn't been processed when the WHERE is processed it would be impossible for something defined in it [the SELECT] to be referenced yet. In a programmatical sense, it would be like trying to reference a variable before you define it.

The order of operations, per the documentation is as follows (thought note this isn't always the case, but the documentation linked earlier covers this in further detail):

  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP

CodePudding user response:

Thanks Larnu, I'll accept your answer when the 10 minutes have expired, but after posting my question, I found that I could just subquery the whole thing:

SELECT *
FROM (
    SELECT Runs.runID ,Runs.runDateTime ,COUNT(Files.destFileID) AS FileCount
    FROM Runs
    LEFT OUTER JOIN Files ON Files.runID = Runs.runID
    WHERE FileCount > 0
    GROUP BY Runs.runID, Runs.runDateTime
    ORDER BY Runs.runDateTime
)
AS results
  WHERE FileCount > 0
  ORDER BY runDateTime

This also works. Does anyone know which of these options are best practices, or the performance of each?

  • Related