I'm trying to use GREATEST()
in Snowflake, but whenever I have null values, I get null
instead of the desired result:
select greatest(1,2,null);
-- null
This behavior has confused many, and it begins with the behavior of GREATEST()
in Oracle, which Snowflake matches:
-
Asking here to get the best available solution.
CodePudding user response:
One solution could be to create a UDF that picks the
greatest()
or the first non null:create or replace function greatest2(x1 float, x2 float) returns float as $$ coalesce(greatest(x1, x2), x1, x2) $$; select greatest2(a, b) from some_nulls;
However things get more complex if you need to compare multiple values. For example, if you want to compare 3 columns, then you have to create a custom UDF with 3 arguments and check each for null:
create or replace function greatest3(x1 float, x2 float, x3 float) returns float as $$ select iff(x='-inf', null, x) from ( select greatest(nvl(x1, '-inf'), nvl(x2, '-inf'), nvl(x3, '-inf')) x ) $$; select greatest3(a, b, c) from some_nulls;
CodePudding user response:
Documenting here an approach that doesn't work (to save others time, or an opportunity to fix): Arrays in a SQL UDF.
create or replace function greatest_a(arr array) returns float immutable as $$ select max(value::float) from table(flatten(arr)) $$; select greatest_a([null,2,3.3]) from some_nulls;
This works until you try to create the array with values from the table.
select greatest_a([a, b, c]) from some_nulls; -- Unsupported subquery type cannot be evaluated
A similar approach with a JS UDF would work, but it will be slower than a pure SQL UDF.
CodePudding user response:
Welcome user: Felipe
The doc's seem to describe the results exactly:
Returns the largest value from a list of expressions. If any of the argument values is NULL, the result is NULL. GREATEST supports all data types, including VARIANT.
If you ORDER BY data,
NULL
is last, GREATEST is returning the LAST value..Snowflake is rather consistent
select a,b,c ,greatest(a, b) as g_a_b ,greatest(a, c) as g_a_c ,greatest(b, c) as g_b_c from values (1.1, 2.3, null::float), (null, 2, 3.5), (1, null, 3), (null, null, null) t(a,b,c)
gives:
A B C G_A_B G_A_C G_B_C 1.1 2.3 null 2.3 null null null 2 3.5 null null 3.5 1 null 3 null 3 null null null null null null null so your min value
-inf
solution is interesting/gross, I mean it's correct.But what GREATEST is doing is handling a fixed number of SQL declared columns, so the NVL soltuion works: But to make a generic solution, building the fix arrays, and then flattening, and the maxing as max handles NULL in the implicit way the problem implies which "them not being selected"
select a,b,c, max(f.value) from ( select a,b,c ,array_construct_compact(a, b, c) as aa from values (1.1, 2.3, null), (null, 2, 3.5), (1, null, 3), (null, null, null) t(a,b,c) ), table(flatten(input=>aa)) as f group by 1,2,3,f.seq
gives:
A B C MAX(F.VALUE) 1 null 3 3 null 2 3.5 3.5 1.1 2.3 null 2.3 CodePudding user response:
It is possible to handle nullable columns with ARRAYs:
SELECT a,b,c, GREATEST([a],[b],[c])[0]::INT FROM some_nulls;
For sample data:
CREATE OR REPLACE TABLE some_nulls(a INT, b INT, c INT) AS SELECT 1, 2, NULL UNION SELECT NULL, 2, 3 UNION SELECT 1, NULL, 3 UNION SELECT NULL, NULL, NULL;
Output:
How it works:
NULL
becomes[undefined]
which is the lowest element, therefore this approach could be used forGREATEST
but NOT forLEAST
function.