Home > Net >  DBT database error, Invalid NUMERIC value: nan
DBT database error, Invalid NUMERIC value: nan

Time:12-17

Hi I tried to merge two table in Bigquery but have an error.

Database Error in model fact_trips (models/core/fact_trips.sql)
    Invalid NUMERIC value: nan
    compiled Code at target/run/taxi_rides_ny/models/core/fact_trips.sql

I have used dbt_expectations.expect_column_values_to_be_of_type to check every numeric column and they are passed.

I have null in these columns but I think null is different from nan right?

Do anyone know how to check which column has invalid value? The error message only show there is an error but no clue on which column and how to fix it?

Here is my repo if you want more details. https://github.com/ulanliu/DE_zoomcamp/blob/main/w4_analytics_engineering/taxi_rides_ny/models/core/fact_trips.sql

CodePudding user response:

There are no explicit casts to numeric in this model, so the error is either in an upstream model or caused by the union of green and yellow.

stg_green_tripdata and stg_yellow_tripdata are both materialized as views, so it's possible that either (or both) of those models is throwing this error. dbt doesn't select from a view after it creates it, so data errors in views are suppressed until the view is queried, which in your case is during the construction of your fact_trips model.

I'd start debugging this by doing a simple select count(*) from stg_green_tripdata and select count(*) from stg_yellow_tripdata. If either (or both) of those error, then you can start to dig deeper. If both of those queries succeed, I would change the materialization on those models to tables and dbt run them again, to ensure the simpler queries didn't skip over the problematic cells with indexes.

If you can materialize both upstream models as tables without errors, then I'd examine the union more closely. When you select * from a union select * from b, you rely on a and b having the same number of columns with the same types in the same natural order. I'd take about half of the columns and enumerate them in a query select c, d, e, f from a union select c, d, e, f from b and see if that succeeds or fails. Keep adding or removing columns until you can isolate the column with a type mismatch.

  • Related