Home > Software engineering >  How to incrementally take average in Oracle SQL Or Python
How to incrementally take average in Oracle SQL Or Python

Time:10-10

I am bit stuck with this not getting how do i proceed further

Assume I have below data in table

Note : All the below column fields [NAME , AGE , COURSE , MARKS] are set to VARCHAR2(50 CHAR)

NAME | AGE | COURSE | MARKS 
A1   | ABC | MCA    | 5 
B1   | XYZ | MBA    | 10
C1   | PQR | MMS    | 14
D1   | YST | CAT    | 18

Using below query I am able to get incremental sum , but not able to apply logic how to get incremental average

My Query :

select c1.* , SUM(MARKS) Over (Order by c1.NAME) AS CUM_COUNT from EMP c1 order by c1.NAME ASC;

Expected output :

NAME | AGE | COURSE | MARKS | CUM_COUNT | AVGS_CNT      |
A1   | ABC | MCA    | 5     | 5   | 5/1  = 5      |
B1   | XYZ | MBA    | 10    | 15  | 15/2 = 7.5    |
C1   | PQR | MMS    | 14    | 29  | 29/3 = 9.6    |
D1   | YST | CAT    | 18    | 47  | 47/4 = 11.75  |

Solution in oracle OR python appreciated

Note : I have date in Oracle

The reason behind asking this question as it was asked to me in interview

CodePudding user response:

You used analytic sum (over "order by ...") - use avg instead and you will get exactly what you need, with a single analytic function. You don't need anything else.

In your query, you succeeded in confusing yourself. You took the cumulative sum but you called it cum_count. You could also take a cumulative count with the count() function (analytic version, also ordered by "name"), and then divide the cumulative sum (which you called cum_count) by the actual cumulative count to get the running average. Using analytic avg directly is much simpler though.

Caution: (more advanced topic) In your table you have name and age etc. - nothing that is obviously unique (something that could be used as primary key). In analytic functions, if you don't specify the windowing clause and there are possible ties in the order (which is possible with name and age), rows that are "tied" are considered all at once. To break such ties, you could for example use an ordering clause like order by name, rownum. That way, when the name is the same, rownum will break ties and allow rows to be added to the computation one at a time.

CodePudding user response:

To complement @mathguy's answer, here the full query you need:

select
  e.*,
  sum(marks) over(order by name) as cum_count,
  1.0 * sum(marks) over(order by name) / count(*) over(order by name) as avgs_cnt
from emp e

Result:

 NAME  AGE  COURSE  MARKS  CUM_COUNT  AVGS_CNT         
 ----- ---- ------- ------ ---------- ---------------- 
 A1    ABC  MCA     5      5          5                
 B1    XYZ  MBA     10     15         7.5              
 C1    PQR  MMS     14     29         9.66666666666667 
 D1    YST  CAT     18     47         11.75            

It uses windows functions by qualifying SUM() and COUNT(*) with OVER(ORDER BY NAME).

See running example at db<>fiddle.

  • Related