Home > Net >  Redshift SQL: Is there a difference between `=` and `is` for boolean values?
Redshift SQL: Is there a difference between `=` and `is` for boolean values?

Time:08-11

In aws Redshift, is there a difference between boolean_column = true and boolean_column IS true?

The docs recommend "always checking Boolean values explicitly" https://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html

I have a redshift table with three boolean columns. The following query runs without error and returns the expected result (i.e. my_any_column is true if any of the three columns is true):

-- Does work
SELECT
    (
        boolean_column_01 = true
        OR boolean_column_02 = true
        OR boolean_column_03 = true
    ) AS my_any_column
FROM
    my_example_table;

However, if I change the query as follows, there is an error:

-- Does not work
SELECT
    (
        boolean_column_01 is true
        OR boolean_column_02 is true
        OR boolean_column_03 is true
    ) AS my_any_column
FROM
    my_example_table;

This is the error message of the query using is:

SQL Error [XX000]: ERROR: Assert
  Detail:
  -----------------------------------------------
  error:  Assert
  code:      1000
  context:   arg_func_info.is_nullable_ == fld->a_nullable() - arg_func_info.is_nullable_=0, fld->a_nullable()=1.
  query:     1008428
  location:  cg_expr_fn_builder.cpp:8993
  process:   padbmaster [pid=4919]
  -----------------------------------------------

The weirdest thing is that the following query does work without error:

-- Does work
SELECT
    (
        boolean_column_01 is true
        OR boolean_column_02 is true
        OR boolean_column_03 = true
    ) AS my_any_column
FROM
    my_example_table;

With 2 * is and 1 * = the query works (irrespective of the order), but as soon as there are 3 * is it fails.

As a workaround I can simply always use = instead of is, but I cannot find such a recommendation in the docs.

CodePudding user response:

IS true and = are not equivalent in Redshift. Per the Redshift docs

"You can use an IS comparison to check a Boolean value only as a predicate in the WHERE clause. You can't use the IS comparison with a Boolean value in the SELECT list."

See: https://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html (Yes this is the same page you linked)

As you likely expected this is (in part) due to the handling of NULLs. Conditions in the Redshift use trinary (three-valued) logic (true, false, unknown) and by performing an "IS" comparison the logic moves back to binary. Binary is better understood by most and thus explicit checks lead to fewer bugs (it is also clearer to those reading the code). See: https://docs.aws.amazon.com/redshift/latest/dg/r_logical_condition.html

  • Related