I'm trying to average a set of columns and exclude rows with zero from the denominator by using a case statement within an AVG() function. I read that avg() excludes NULL results so am using the case statement to replace 0's with NULL's. However that's not what I find in practice - see code below. Can someone explain why this happens? If you can suggest a code tweak to achieve what I'm after that would also be fab. Thanks.
with a as
(
select 0 t1, 3 t2
)
, b as
(
select 6 t1, 0 t2
)
, c as -- building a small table containing test data.
(
select * from a
union all
select * from b
)
select sum(case when t2 = 0 then null else t2 end case when t1 = 0 then null else t1 end) r1
, avg(case when t2 = 0 then null else t2 end case when t1 = 0 then null else t1 end) r2
, avg(t1) r3
from c
What subquery c contains:
t1 | t2 |
---|---|
0 | 3 |
6 | 0 |
The actual result of my query:
r1 | r2 | r3 |
---|---|---|
NULL | NULL | 3 |
Column r2
is what I would like the result of my query to be: avg(3 null, null 6) = avg(3, 6) = 4.5
:
r1 | r2 | r3 |
---|---|---|
9 | 4.5 | 3 |
CodePudding user response:
Instead of setting the 0 values to null
, you can filter them out somewhere in between:
with a as
(
select cast(0 as float) t1, cast(3 as float) t2
)
, b as
(
select cast(6 as float) t1, cast(0 as float) t2
)
, c as -- building a small table containing test data.
(
select * from a where t1 > 0 or t2 > 0
union all
select * from b where t1 > 0 or t2 > 0
)
select sum(t2 t1) r1
, avg(t2 t1) r2
, avg(t1) r3
from c;
Output:
| r1 | r2 | r3 |
|---- ---- ----|
| 9 | 4.5| 3 |
*See how I convert numbers to float
type, in order to capture the floating point on r2
.
CodePudding user response:
If you want to use your logic (... AVG() excludes NULL results so am using the case statement to replace 0's with NULL's ...), you need an additional APPLY
operator and appropriate CASE
expressions:
;WITH a as (
select 0.0 t1, 3.0 t2
), b as (
select 6.0 t1, 0.0 t2
),
c as (
select * from a
union all
select * from b
)
SELECT
r1 = SUM(CASE WHEN a.[value] <> 0.0 THEN a.[value] END),
r2 = AVG(CASE WHEN a.[value] <> 0.0 THEN a.[value] END),
r3 = AVG(CASE WHEN a.[column] = 't1' THEN a.[value] END)
FROM c
CROSS APPLY (VALUES ('t1', c.t1), ('t2', c.t2)) a ([column], [value])
Result:
r1 | r2 | r3 |
---|---|---|
9.0 | 4.500000 | 3.000000 |
Warning: Null value is eliminated by an aggregate or other SET operation.