Home > front end >  CAST array or struct as a string
CAST array or struct as a string

Time:11-01

I am loading my data to athena from json format using org.apache.hive.hcatalog.data.JsonSerDe. Currently, I specify objects and lists with struct<...> or array<...>, but the goal is to have them in a final parquet table as varchar or string type. So that I could have it still in a JSON format but with a string/varchar type.

Could you advise some good way to do it?

Casting to json and afterwards to varchar did not work for me:

 CREATE TABLE new_table AS SELECT cast(my_struct_col as varchar) my_struct_col FROM (
    SELECT cast(my_struct_col as JSON) my_struct_col FROM raw_table
    )

CodePudding user response:

Use json_format to turn your json into varchar:

json_format(json)varchar Returns the JSON text serialized from the input JSON value. This is inverse function to json_parse():

SELECT json_format(JSON '[1, 2, 3]'); -- '[1,2,3]'
SELECT json_format(JSON '"a"'); -- '"a"'

Note that json representation for ROW in Presto is just an array -Presto does not preserve the column names info:

SELECT CAST(CAST(ROW(123, 'abc', true) 
    AS ROW(v1 BIGINT, v2 VARCHAR, v3 BOOLEAN)) AS JSON); 
-- JSON '[123,"abc",true]'

This was changed in Trino (Athena version 3 engine should be based on it):

SELECT CAST(CAST(ROW(123, 'abc', true) AS
            ROW(v1 BIGINT, v2 VARCHAR, v3 BOOLEAN)) AS JSON);
-- JSON '{"v1":123,"v2":"abc","v3":true}'
  • Related