Hi I tried to merge two table in Bigquery but have an error.
Database Error in model fact_trips (models/core/fact_trips.sql)
Invalid NUMERIC value: nan
compiled Code at target/run/taxi_rides_ny/models/core/fact_trips.sql
I have used dbt_expectations.expect_column_values_to_be_of_type
to check every numeric column and they are passed.
I have null in these columns but I think null is different from nan right?
Do anyone know how to check which column has invalid value? The error message only show there is an error but no clue on which column and how to fix it?
Here is my repo if you want more details. https://github.com/ulanliu/DE_zoomcamp/blob/main/w4_analytics_engineering/taxi_rides_ny/models/core/fact_trips.sql
CodePudding user response:
There are no explicit casts to numeric
in this model, so the error is either in an upstream model or caused by the union of green
and yellow
.
stg_green_tripdata
and stg_yellow_tripdata
are both materialized as views, so it's possible that either (or both) of those models is throwing this error. dbt doesn't select
from a view after it creates it, so data errors in views are suppressed until the view is queried, which in your case is during the construction of your fact_trips
model.
I'd start debugging this by doing a simple select count(*) from stg_green_tripdata
and select count(*) from stg_yellow_tripdata
. If either (or both) of those error, then you can start to dig deeper. If both of those queries succeed, I would change the materialization on those models to tables and dbt run
them again, to ensure the simpler queries didn't skip over the problematic cells with indexes.
If you can materialize both upstream models as tables without errors, then I'd examine the union
more closely. When you select * from a union select * from b
, you rely on a
and b
having the same number of columns with the same types in the same natural order. I'd take about half of the columns and enumerate them in a query select c, d, e, f from a union select c, d, e, f from b
and see if that succeeds or fails. Keep adding or removing columns until you can isolate the column with a type mismatch.