I have been exploring dbt tools and I came across the following code snippet :
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
I understand that a coalesce function is used to return the first non-null value in a list. What I do not understand is what does the zero in the second parameter signify?
CodePudding user response:
The integer 0 is not NULL, so it may be the first non-NULL argument, if customer_orders.number_of_orders
is NULL. Therefore the 0 serves as a default.
CodePudding user response:
COALESCE
can use as many arguments as you want. In most cases (like in your example), COALESCE(some_column,0)
is used to prevent that creating a sum or building an average will not lead to the desired result.
Assume there are three columns and you want to sum them. In case you don't use COALESCE
, the sum will be NULL
even if only one of your three columns is NULL
. So you will use COALESCE
and replace NULL
values by zero in order to receive the sum of all NOT NULL
values.
You can "translate" COALESCE
into a CASE WHEN
construct:
COALESCE(column1,0)
does following:
CASE WHEN column1 IS NULL THEN 0 ELSE column1 END
Another use case of COALESCE
is to replace column1 by column2 if column1 is NULL
, if also column2 is NULL
, take column3 etc.
I created an example here, so you can see what I mean:
db<>fiddle
CodePudding user response:
The COALESCE function returns the first non-null value in a list. COALESCE
can take n number of arguments.
COALESCE(val1, val2, ...., val_n)
So according to the query:
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
In case customer_orders.number_of_orders
is NULL
the result returned in number_of_orders
would be 0.