I want to replicate a simple case-when statement with a jinja block in dbt. How can I achieve this?
Here is my statement:
CASE status
WHEN 0 THEN 'pending'
WHEN 1 THEN 'ordered'
WHEN 2 THEN 'shipped'
WHEN 3 THEN 'received'
WHEN 4 THEN 'delivered'
ELSE NULL
END as status_mapping
CodePudding user response:
You can use a macro to insert reusable SQL snippets across different queries, which is one possible reason you might want to do this.
You could define the macro as follows:
-- yourproject/macros/status_mapping.sql
{% macro status_mapping(status) %}
CASE {{ status }}
WHEN 0 THEN 'pending'
WHEN 1 THEN 'ordered'
WHEN 2 THEN 'shipped'
WHEN 3 THEN 'received'
WHEN 4 THEN 'delivered'
ELSE NULL
END
{% endmacro %}
(I have kept the definition flexible)
... and call it in a model e.g. as follows:
-- yourproject/models/base/base__orders.sql
SELECT
order_id,
status_code,
{{ status_mapping('status_code') }} AS status
FROM
{{ source('your_dataset', 'orders') }}
Note the use of quotes around the field name, same as with the built-in source
macro two lines below. By including the field name as a macro argument instead of hard-coding it (and keeping the aliasing AS status
outside the macro) you allow yourself flexibility to change things in future.
This would then be compiled when you run DBT to something like:
SELECT
order_id,
status_code,
CASE status_code
WHEN 0 THEN 'pending'
WHEN 1 THEN 'ordered'
WHEN 2 THEN 'shipped'
WHEN 3 THEN 'received'
WHEN 4 THEN 'delivered'
ELSE NULL
END AS status
FROM
your_dataset.orders
CodePudding user response:
You have a couple options. First, you can define the mappings in an array or dict (if the ids are not a sequence) and loop through it to produce the full case statement:
{% set statuses = ['pending', 'ordered', 'shipped', 'received', 'delivered'] %}
CASE STATUS
{% for status in statuses %}
WHEN {{ loop.index - 1 }} THEN '{{ status }}'
{% endfor %}
ELSE NULL END STATUS_MAPPING
The other option is to put the mappings into a CSV, load it as a seed data file in DBT (https://docs.getdbt.com/docs/build/seeds), then join with the seed data as a ref
.
Create a file called status_mappings.csv
:
status_code,status
0,pending
1,ordered
2,shipped
3,received
4,delivered
Run dbt seed
, then add
WITH STATUS_MAPPINGS AS (
SELECT * FROM {{ ref('status_mappings') }}
}
SELECT S.STATUS
FROM MY_TABLE T1
JOIN STATUS_MAPPINGS SM ON T1.STATUS_CODE = SM.STATUS_CODE