Home > Software engineering >  SQL AVG with several category
SQL AVG with several category

Time:03-19

I have simple a table1 with

columnA     columnB
5            a
10           b
20           c

My goal to get the avg result on new columnC like this

columnA     columnB     columnC
5            a           .14
10           b           .29
20           c           .57

My query but did not get avg (each row columnA / total columnA)

select avg(columnA) as columnC
from table1

Not sure how to fix this. Thank you

CodePudding user response:

I'm assuming your required results are correct, because in that case you are not looking for an average, you are looking for a proportion of total.

So you want to divide your columnA value by the total for columnA, for which you can use a window function to obtain e.g.

declare @MyTable table (columnA int, columnB varchar(1));

insert into @MyTable (columnA, columnB)
    values
    (5, 'a'),
    (10, 'b'),
    (20, 'c');

select *
    , convert(decimal(9,2), columnA * 1.0 / sum(columnA * 1.0) over ()) columnC
from @MyTable;

Returns:

columnA columnB columnC
5 a 0.14
10 b 0.29
20 c 0.57

Note: Because your values are integers you need to convert them to decimal (*1.0) in order to get a decimal result. And then I have added a further convert to restrict the decimal places to 2.

CodePudding user response:

What you want is called 'proportion of total', one way to do it:

select columnA, columnB, cast(columnA/SumOfColumnA as decimal(15,2)) as columnC
from table1
     cross join
     (
       select sum(cast(columnA as float)) as SumOfColumnA from table1
     ) as SumCA

or, without cross join

select 
      columnA
    , columnB 
    , cast(columnA /
          (select sum(cast(columnA as float)) as SumOfColumnA from table1) as decimal(15,2)
          ) as columnC
from table1

Challenge accepted @DaleK (just kidding, you piqued my curiosity): I created a real table, and inserted 100K rows similar to the ones in the example here, and did an actual execution plan compare. When submitted as a batch, the query with windowing function reported the actual cost as 67% of the batch (ie it performed worse). Admittedly I didn't do anymore to see why.

Execution plan compare

The real life scenarios maybe quite different, of course.

  • Related