I have a table in presto with a column of JSON data and the json having 2 array entries:
Array "key-array":
[AAA, BBB, CCC]
Array "value-array":
[123, 456, 789]
The table would be of this format:
json-column
{"key-array":"[\"AAA\",\"BBB\",\"CCC\"]", "value-array":"[\"123\",\"456\",\"789\"]","str_1":"abc"}
I would like to perform the following operations using presto query:
Search for a string inside array/column key-array and if it exists get its index.
Once I have the index, query array/column value-array and fetch the entry corresponding to the index.
What would the presto query be to print data in the following format:
Key Value
AAA 123
BBB 456
CCC 789
CodePudding user response:
You can just process the json and flatten it with unnest
:
-- sample data
WITH dataset(json_column) AS (
VALUES
('{"key-array":["AAA","BBB","CCC"], "value-array":["123","456","789"]}')
)
-- query
select key, value
from (
select cast(json_parse(json_column) as map(varchar, array(varchar))) parsed
from dataset
) jt,
unnest(parsed['key-array'], parsed['value-array']) as t(key, value)
Output:
key | value |
---|---|
AAA | 123 |
BBB | 456 |
CCC | 789 |
P.S. casting to map is used cause I was not able to figure out json path escaping for presto, otherwise cast to map and parsed[...]
would be replaced with json_extract
and cast to array(varchar)
.
UPD
Missed that the array values in json are double encoded, so they actually are json strings, not json arrays, so you need to parse them again:
-- sample data
WITH dataset(json_column) AS (
VALUES
('{"key-array":"[\"AAA\",\"BBB\",\"CCC\"]", "value-array":"[\"123\",\"456\",\"789\"]"}')
)
-- query
select key,
value
from (
select cast(json_parse(json_column) as map(varchar, json)) parsed
from dataset
) jt,
unnest(
cast(json_parse(cast(parsed['key-array'] as varchar)) as array(varchar)),
cast(json_parse(cast(parsed['value-array'] as varchar)) as array(varchar))
) as t(key, value)