Home > other >  Get column names AND types using star macro in dbt
Get column names AND types using star macro in dbt

Time:06-24

Using the star macro, is there a way to also get the column data type (boolean, numerical, etc), in addition to the column name?

For example, this query uses the star macro to collect the column names from a reference table, saves it as an array variable column_names, and then I loop over this array and apply the max function to all the columns.

{% set column_names = star(
    from=ref_table,
    except=["a", "b", "c"],
    as_list=True)
%}

select 
    date_trunc('week', day) as week,
    name,

    {%- for col in column_names %}  
    max({{ col|lower }}) as {{ col | lower }}{%- if not loop.last %},{{ '\n  ' }}{% endif %}
    {%- endfor %}

from {{ ref('my_table_name') }}    
group by 1, 2

I would like to conditionally apply the max function to only boolean columns.

This might look something like

{%- for col in column_names %}  
    {% if is_boolean(col) %}  
    max({{ col|lower }}) as {{ col | lower }}{%- if not loop.last %},{{ '\n  ' }}{% endif %}
    {% endif %}
{%- endfor %}

but the problem is that the star macro passes the column names as a string, so it's not carrying any metadata with it.

How might I get the column data type here?

Data warehouse: Snowflake

CodePudding user response:

You can view the source for dbt_utils.star here

Under the hood, it uses dbt_utils.get_filtered_columns_in_relation. That macro also just returns column names. However! that macro uses the built-in adapter.get_columns_in_relation, which returns a list of Column objects, which have a dtype property.

So your code becomes:

{% set all_columns = adapter.get_columns_in_relation(
    ref("my_table")
) %}
{% set except_col_names=["a", "b", "c"] %}

select 
    date_trunc('week', day) as week,
    name,

    {%- for col in all_columns if col.column not in except_col_names %}  
    {% if col.dtype == 'bool' %}  
    max({{ col.column|lower }}) as {{ col.column | lower }}{%- if not loop.last %},{{ '\n  ' }}{% endif %}
    {% endif %}
    {%- endfor %}

from {{ ref('my_table_name') }}    
group by 1, 2
  • Related