Using CASE WHEN with newly created columns-CodePudding

I have a table in SQL Server, I've grouped it by ID and created 2 new columns with the counts of other data. I want to be able to create another column at the same time, which displays a 1 if the counts in both columns are greater than a number, otherwise will display 2. However when I try it, it says invalid column name, I guess as my count columns aren't in the original table?

My data is similar to:

    ID    Data1    Data2
-------------------------
0    1       1       5
1    1       2       5
2    1       5       8
3    1       7       9
4    2       8       5
5    2       7       3
6    2       9       2
7    3       3       1
8    3       3       6
9    3       2       7
10   3       6       3
11   3       8       0

Desired output (where code is 1 if >= 4, else 2):

   ID  CountData1  CountData2  Code
------------------------------------
0   1           4           4     1
1   2           3           3     2
2   3           5           5     1

Current query:

SELECT 
    ID,
    COUNT(Data1) AS CountData1,
    COUNT(Data2) AS CountData2,
    (CASE WHEN (CountData1 >= 4 and CountData2 >= 4) THEN 1 ELSE 2 END) AS Code
FROM 
    Table
GROUP BY 
    ID

CodePudding user response：

The way that SQL statements are parsed, you cannot reference an expression you just created at the same scope.

You can either repeat the expressions again:

SELECT 
    ID,
    COUNT(Data1) AS CountData1,
    COUNT(Data2) AS CountData2,
    (CASE WHEN (COUNT(Data1) >= 4 and COUNT(Data2) >= 4) THEN 1 ELSE 2 END) 
    AS Code
FROM 
    dbo.Table
GROUP BY 
    ID;

Or use a CTE or derived table:

-- CTE

;WITH cte AS 
(
  SELECT ID, 
      COUNT(Data1) AS CountData1,
      COUNT(Data2) AS CountData2
    FROM dbo.Table 
    GROUP BY ID
)
SELECT ID, CountData1, CountData2,
  CASE WHEN CountData1 >= 4 AND CountData2 >- 4 
    THEN 1 ELSE 2 END AS Code
FROM cte;

-- Derived Table

SELECT ID, CountData1, CountData2,
  CASE WHEN CountData1 >= 4 AND CountData2 >- 4 
    THEN 1 ELSE 2 END AS Code
FROM     
(
  SELECT ID, 
      COUNT(Data1) AS CountData1,
      COUNT(Data2) AS CountData2
    FROM dbo.Table 
    GROUP BY ID
) AS DerivedTable;

These all perform the same, in spite of some folks who believe the first one is worse because you reference COUNT() additional times. SQL Server is pretty good about not repeating work it doesn't have to, and in fact all three queries above yield the exact same execution plan, with the exact same cost, same number of reads, the same output, and the exact same number of expressions are calculated. CPU and duration will vary slightly because, well, computers.

If you have a case where the CTE yields a better plan than repeating the expressions (noting that there are possibly some cases where both variations will yield multiple count expressions), please post it somewhere.

CodePudding user response：

SELECT TT.ID,TT.CountData1,TT.CountData2,
CASE WHEN TT.CountData1>= 4 and TT.CountData2>=4 THEN 1 ELSE 2 END CODE
FROM
(SELECT ID,
COUNT(Data1) AS CountData1,
COUNT(Data2) AS CountData2
FROM 
   Table
GROUP BY 
 ID)TT;