Making rank() ignore null cases-CodePudding

I'm trying to figure out a way to make the rank() window function "skip" in its counting some rows with null values in a specific column. Pretty much, what I want is to count how many paid transactions there are, for each client, before each transaction/row.

I tried using case when inside the rank() and I got something similar to the results I expect, but still not quite what I need.

 ------------------------------------------------------- 
|                  What I need                          |
 ------------- ------ ---------- ----------------------- 
| CLIENT      | CODE | PAYMENT  | PAID_PURCHASES_SO_FAR |
| A           | 341  | 17/09/21 | 0                     |
| A           | 342  | 18/09/21 | 1                     |
| A           | 343  | (null)   | 2                     |
| A           | 344  | 18/09/21 | 2                     |
| A           | 345  | 19/09/21 | 3                     |
| A           | 346  | 19/09/21 | 4                     |
| A           | 347  | (null)   | 5                     |
| A           | 348  | 24/09/21 | 5                     |
| B           | 855  | (null)   | 0                     |
| B           | 856  | 20/09/21 | 0                     |
| B           | 857  | (null)   | 1                     |
 ------------- ------ ---------- ----------------------- 


- ------------------------------------------------------ 
 |                  What I got                          |
- ------------ ------ ---------- ----------------------- 
 | CLIENT     | CODE | PAYMENT  | PAID_PURCHASES_SO_FAR |
 | A          | 341  | 17/09/21 | 0                     |
 | A          | 342  | 18/09/22 | 1                     |
 | A          | 343  | (null)   | (null)                |
 | A          | 344  | 18/09/22 | 2                     |
 | A          | 345  | 19/09/22 | 3                     |
 | A          | 346  | 19/09/21 | 4                     |
 | A          | 347  | (null)   | (null)                |
 | A          | 348  | 24/09/21 | 5                     |
 | B          | 855  | (null)   | (null)                |
 | B          | 856  | 20/09/21 | 0                     |
 | B          | 857  | (null)   | (null)                |
- ------------ ------ ---------- -----------------------

In a single image: comparison

And here my code:

SELECT 
  CLIENT
, CODE
, PAYMENT
, CASE WHEN PAYMENT IS NOT NULL THEN DENSE_RANK() OVER(PARTITION BY CLIENT, (CASE WHEN PAYMENT IS NOT NULL THEN 1 ELSE 0 END) ORDER BY CODE) - 1 END NUMBER_OF_PURCHASES_SO_FAR 
FROM FOO.BAR

Note: The CODE column may be used as time reference. E.g. code = 750 came before code = 751, and so on.

Any help would be appreciated.

Thanks in advance.

CodePudding user response：

You can use conditional aggregation combined with a window frame, as in:

select *, coalesce(sum(case when payment is null then 0 else 1 end) 
            over(partition by client order by code 
                 rows between unbounded preceding and 1 preceding), 0)
  as ppsf
from t
order by client, code

Result:

 client  code  payment   ppsf 
 ------- ----- --------- ---- 
 A       341   17/09/21  0    
 A       342   18/09/21  1    
 A       343   null      2    
 A       344   18/09/21  2    
 A       345   19/09/21  3    
 A       346   19/09/21  4    
 A       347   null      5    
 A       348   24/09/21  5    
 B       855   null      0    
 B       856   20/09/21  0    
 B       857   null      1

See running example at db<>fiddle.

CodePudding user response：

This is it:

SELECT 
  "CLIENT"
, "CODE"
, "PAYMENT"
, rank() over(partition by "CLIENT" 
              order by COALESCE("PAYMENT",'01/01/70') )
FROM Table1

http://sqlfiddle.com/#!15/f941a/14