I'm trying to create a rank() over() but need it to return null when meeting a certain criteria. Then continue the rank on the next row.
Example of what i'm trying to accomplish is the column rank_over_except in this image.
It is ranking() over() the identifier and is ordered by original_nr column. In this case it doesn't "rank" when the fruit is a pear.
I have tried using a CASE WHEN THEN statement. But that is simply continuing the count. The null would simply replace a 2 on the 2nd row of rank_over_except in this example. And the third row would be a 3. So that's not working as expected.
I don't see any option to write the rank() over(). Maybe there's another way of doing this and not use rank()? I've gone through the BigQuery docs, but no luck in finding a solution.
CodePudding user response:
Starting from the original numbering, we can use arithmethics and a conditional count to compute the new rank:
select t.*,
original_nr - countif(fruit = 'apple') over(partition by identifier order by original) rank_over_except
from mytable t
A more canonical approach defines a partition which contains everything but apples with a CASE
expression:
select t.*,
case when fruit != 'apple'
then rank() over(
partition by identifier, case when fruit = 'apple' then 1 else 0 end
order by original_nr
)
end rank_over_except
from mytable t
order by id
Both solutions are more efficient than the union
approach, that requires scanning the table twice.
CodePudding user response:
The solution that worked fine for my case is based on the comment of @SR3142. Maybe with row_number @GMB's solution can also work, but the size of the data was neglegible in my case.
select
t.*,
ROW_NUMBER() OVER (PARTITION BY identifier ORDER BY original_nr ASC) rank_over_except
from table as t
WHERE t.fruit NOT IN ("pear", "apple")
UNION ALL
SELECT
NULL AS rank_over_except,
t.*
FROM table AS t
WHERE
t.fruit IN ("pear", "apple")
Reason to use row_number() over rank() is that row_number does not take in to account the data from the input column. So it's an actual "dumb" next in line number.