I am trying to extract a string from a JSON in presto. Here are the 2 options for how the extracted data can look before I cast to a string:
JSON_EXTRACT(my_column, '$.my_column_extract')
= ["A::1","B::2","C::3","D::4"]JSON_EXTRACT(my_column, '$.my_column_extract')
= {"0":"A::1","1":"B::2","2":"C::3","3":"D::4,5"}
For both above options, the goal string format I want to achieve: For 1) A::1, B::2, C::3, D::4 For 2) A::1, B::2, C::3, D::4,5
So I cast as ARRAY(VARCHAR)
and then ARRAY_JOIN
to get a string. This works for 1) but not for 2).
- ARRAY_JOIN( CAST( JSON_EXTRACT(my_column, '$.my_column_extract') AS ARRAY(VARCHAR) ), ', ' ) = A::1, B::2, C::3, D::4
- The same functions result in an error due the format. Error notice (INVALID_CAST_ARGUMENT) Cannot cast to array(varchar). Expected a json array, but got {"0":"A::1","1":"B::2","2":"C::3","3":"D::4,5"}
How can I get 2) to result in the goal string format?
CodePudding user response:
If your records are available only in those two formats then you can leverage try_cast
function which returns null
if the cast can't be performed and use coalesce
two cast to the second format:
-- sample data
WITH dataset(my_column) AS (
values ('{"my_column_extract":["A::1","B::2","C::3","D::4"]}'),
('{"my_column_extract":{"0":"A::1","1":"B::2","2":"C::3","3":"D::4,5"}}')
),
-- query parts
extracted as (
select JSON_EXTRACT(my_column, '$.my_column_extract') extr
from dataset)
SELECT coalesce(
try_cast(extr as array(varchar)), -- try process array
map_values(cast(extr as map(varchar, varchar)))) -- if not array - process json object as map and get map values
FROM extracted;
Output:
_col0 |
---|
[A::1, B::2, C::3, D::4] |
[A::1, B::2, C::3, D::4,5] |
And then process resulting arrays as needed