I am trying to run a query in Snowflake to convert a GeoJSON object into Snowflake's baked-in geospatial data types:
SELECT id, -- some other columns
TRY_TO_GEOGRAPHY(raw_row) -- raw row is my GeoJSON object
FROM ...
The Snowflake documentation for TRY_TO_GEOGRAPHY
says:
Parses an input and returns a value of type GEOGRAPHY.
This function is essentially identical to TO_GEOGRAPHY except that it returns NULL when TO_GEOGRAPHY would issue an error.
However when I run the query I get the following very uninformative error (no mention of column, record or even SQL line that produced the error):
Failed to cast variant value "null" to OBJECT
I pinpointed the TRY_TO_GEOGRAPHY
to be the cause since the query works if I comment that line. Shouldn't Snowflake return NULL
in this case? How can I fix the problem? And if there's faulty data in my table, is there any way for me to find the rows where the function fails (it has 9 digits row count, I can't do it manually)?
CodePudding user response:
Is your raw_row
of VARIANT
type or VARCHAR
?
The Snowflake documentation for TRY_TO_GEOGRAPHY
also mentions (emphasis mine)
<variant_expression>
The argument must be an OBJECT in GeoJSON format.
So if you try to parse a variant value that is not an "OBJECT in GeoJson format" it will indeed raise an error:
SELECT
TRY_TO_GEOGRAPHY('' :: VARCHAR), -- NULL
TRY_TO_GEOGRAPHY('' :: VARIANT), -- error!
TRY_TO_GEOGRAPHY('{"geometry": null, "type": "Feature"}' :: VARCHAR), -- NULL
TRY_TO_GEOGRAPHY('{"geometry": null, "type": "Feature"}' :: VARIANT); -- error!
However you can argue the documentation is not all that clear on this issue, and there are some cases where they both act the same. Surprisingly:
SELECT
-- just removing {"type": "Feature"} from the error above
TRY_TO_GEOGRAPHY('{"geometry": null}' :: VARIANT), -- NULL
-- just removing {"geometry": null} from the error above
-- a Feature without "geometry" and "properties" is ill-defined by the spec
TRY_TO_GEOGRAPHY('{"type": "Feature"}' :: VARIANT); -- NULL
So it's unclear if it's a bug in Snowflake or if it's intended behavior. Note TRY_TO_GEOGRAPHY
will work as expected if the object is a valid GeoJSON format but the geography itself is invalid. For example a polygon where edges cross each other will return null for the TRY_
version but fail with an error with TO_GEOGRAPHY
.
How to fix it: the simplest way is to convert the variant column to varchar, which is a bit silly but it works. You can then query the NULL values produced and check the GeoJSON values that produced an error, hoping it's not that many to manually check them.
CodePudding user response:
You may need to convert the object under key geometry.
The Geojson may have other keys, such as feature and properties. Your table can be designed with these 1st level keys as columns (in my case I just drop feature). In case the geometry is null, TRY_TO_GEOGRAPHY will return NULL. In case it is one of the others, there is no need to do a conversion (just use type OBJECT or VARIANT).
The problem seems to be getting a GEOGRAPHY data from a non-viable object as source (e.g., having the object under geometry as null).