Home > database >  SQL - How to do Window function if there is NULL value?
SQL - How to do Window function if there is NULL value?

Time:10-09

First, I have this information:

  1. Weight A
  2. Weight B
  3. Relationship B to A: 1 to Many

Such that, below result can be obtained:

Weight A Weight B B_Id
3 16 1
5 16 1
6 16 1
7 16 1
2 12 2
6 12 2

Now, adding two more columns: Sum Weight A By B_Id, Accumulative Difference (consider below as table t2)

Weight A Sum Weight A By B_Id Weight B B_Id Accumulative Diff
3 21 16 1 5
5 21 16 1 5
6 21 16 1 5
7 21 16 1 5
2 8 12 2 1
6 8 12 2 1

For example above,

  1. first row accumulative difference => 21 - 16 = 5

  2. the fifth row accumulative difference => (21 8) - (16 12) = 1

So, my objective, is to compute such 'Accumulative Difference' The entire result is to be displayed in the report.

Technically, by using 'Window Functions', this can be achieved without problem. First, I have to create 2 more columns: Accumulate Weight A By B_Id, Accumulate Weight B. Then, just find the difference between the two.

Weight A Sum Weight A By B_Id Weight B B_Id Accumulate Weight A By B_Id Accumulate Weight B Accumulative Diff
3 21 16 1 21 16 5
5 21 16 1 21 16 5
6 21 16 1 21 16 5
7 21 16 1 21 16 5
2 8 12 2 29 28 1
6 8 12 2 29 28 1

sample SQL (from t2):

SELECT 
 *,
 [Accumulate Weight A By B_Id] = SUM(WeightA) OVER (PARTITION BY ... ORDER BY B_Id),
 [Accumulate Weight B] = SUM(WeightB) OVER (PARTITION BY ... ORDER BY B_Id)
FROM t2
-- (...) could be by date year month
-- Accumulate Weight B can set to only 1st row, etc

Now, the problem comes, IF one of the B_Id is NULL. Consider below, note on the highlighted row

Weight A Sum Weight A By B_Id Weight B B_Id Accumulate Weight A By B_Id Accumulate Weight B Accumulative Diff
3 21 16 1 21 16 5
5 21 16 1 21 16 5
6 21 16 1 21 16 5
7 21 16 1 21 16 5
2 8 12 2 29 28 1
6 8 12 2 29 28 1
9 9 0 NULL 38 28 10
7 10 8 3 48 36 12
3 10 8 3 48 36 12

Since there is no corresponding B_Id (not data issue), the Weight B is 0. Theoretically, the data should be populated as usual. But, if we still order by B_Id, thing is not going to work. As all with NULL B_Id will be considered as one (hence, the sequence is impacted), which is not favorable.

So my question, how to handle such situation?

CodePudding user response:

You can use coalesce().

SELECT 
 *,
 [Accumulate Weight A By B_Id] = SUM(WeightA) OVER (PARTITION BY B_id ORDER BY B_Id),
 [Accumulate Weight B] = SUM(WeightB) OVER (PARTITION BY B_id ORDER BY B_Id),
 SUM(coalesce(WeightA,0)-coalesce(WeightB,0)) OVER (PARTITION BY B_id ORDER BY B_Id) difference

FROM t2

PS: Actually your initial query looks wrong to me, if that was correct then this would do. Probably you should give sample data of A and B. To me it makes more sense to sum() before joining them at all.

CodePudding user response:

You will have to make changes but this should help.`

SELECT [Accumulate Weight A By B_Id] = SUM(WeightA) OVER (
        PARTITION BY...ORDER BY B_Id
        )
    ,[Accumulate Weight B] = SUM(WeightB) OVER (
        PARTITION BY...ORDER BY B_Id
        )
FROM t2
WHERE B_Id IS NOT NULL

UNION

SELECT [Accumulate Weight A By B_Id] = SUM(TAB.WeightA) OVER (
        PARTITION BY TAB.ROW_NUM ORDER BY B_Id
        )
    ,[Accumulate Weight B] = SUM(TAB.WeightB) OVER (
        PARTITION BY TAB.ROW_NUM ORDER BY B_Id
        )
FROM (
    SELECT WeightA
        ,WeightB
        ,B_Id
        ,ROW_NUMBER() OVER (
            ORDER BY B_ID
            ) AS ROW_NUM
    FROM T2
    WHERE B_ID IS NULL
    ) TAB

`

  • Related