In aws Redshift, is there a difference between boolean_column = true
and boolean_column IS true
?
The docs recommend "always checking Boolean values explicitly" https://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html
I have a redshift table with three boolean columns. The following query runs without error and returns the expected result (i.e. my_any_column
is true
if any of the three columns is true):
-- Does work
SELECT
(
boolean_column_01 = true
OR boolean_column_02 = true
OR boolean_column_03 = true
) AS my_any_column
FROM
my_example_table;
However, if I change the query as follows, there is an error:
-- Does not work
SELECT
(
boolean_column_01 is true
OR boolean_column_02 is true
OR boolean_column_03 is true
) AS my_any_column
FROM
my_example_table;
This is the error message of the query using is
:
SQL Error [XX000]: ERROR: Assert
Detail:
-----------------------------------------------
error: Assert
code: 1000
context: arg_func_info.is_nullable_ == fld->a_nullable() - arg_func_info.is_nullable_=0, fld->a_nullable()=1.
query: 1008428
location: cg_expr_fn_builder.cpp:8993
process: padbmaster [pid=4919]
-----------------------------------------------
The weirdest thing is that the following query does work without error:
-- Does work
SELECT
(
boolean_column_01 is true
OR boolean_column_02 is true
OR boolean_column_03 = true
) AS my_any_column
FROM
my_example_table;
With 2 * is
and 1 * =
the query works (irrespective of the order), but as soon as there are 3 * is
it fails.
As a workaround I can simply always use =
instead of is
, but I cannot find such a recommendation in the docs.
CodePudding user response:
IS true and = are not equivalent in Redshift. Per the Redshift docs
"You can use an IS comparison to check a Boolean value only as a predicate in the WHERE clause. You can't use the IS comparison with a Boolean value in the SELECT list."
See: https://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html (Yes this is the same page you linked)
As you likely expected this is (in part) due to the handling of NULLs. Conditions in the Redshift use trinary (three-valued) logic (true, false, unknown) and by performing an "IS" comparison the logic moves back to binary. Binary is better understood by most and thus explicit checks lead to fewer bugs (it is also clearer to those reading the code). See: https://docs.aws.amazon.com/redshift/latest/dg/r_logical_condition.html