Home > OS >  Get the index of an array entry using presto
Get the index of an array entry using presto

Time:07-09

I have a table in presto with a column of JSON data and the json having 2 array entries:

Array "key-array":

[AAA, BBB, CCC]

Array "value-array":

[123, 456, 789]

The table would be of this format:

json-column
     
{"key-array":"[\"AAA\",\"BBB\",\"CCC\"]", "value-array":"[\"123\",\"456\",\"789\"]","str_1":"abc"}

I would like to perform the following operations using presto query:

  1. Search for a string inside array/column key-array and if it exists get its index.

  2. Once I have the index, query array/column value-array and fetch the entry corresponding to the index.

What would the presto query be to print data in the following format:

Key  Value
AAA  123
BBB  456
CCC  789

CodePudding user response:

You can just process the json and flatten it with unnest:

-- sample data
WITH dataset(json_column) AS (
    VALUES 
        ('{"key-array":["AAA","BBB","CCC"], "value-array":["123","456","789"]}')
) 

-- query
select key, value
from (
    select cast(json_parse(json_column) as map(varchar, array(varchar))) parsed
    from dataset
) jt,
unnest(parsed['key-array'], parsed['value-array']) as t(key, value)

Output:

key value
AAA 123
BBB 456
CCC 789

P.S. casting to map is used cause I was not able to figure out json path escaping for presto, otherwise cast to map and parsed[...] would be replaced with json_extract and cast to array(varchar).

UPD

Missed that the array values in json are double encoded, so they actually are json strings, not json arrays, so you need to parse them again:

-- sample data
WITH dataset(json_column) AS (
    VALUES 
        ('{"key-array":"[\"AAA\",\"BBB\",\"CCC\"]", "value-array":"[\"123\",\"456\",\"789\"]"}')
) 

-- query
select key, 
    value
from ( 
    select cast(json_parse(json_column) as map(varchar, json)) parsed 
    from dataset 
) jt, 
unnest(
        cast(json_parse(cast(parsed['key-array'] as varchar)) as array(varchar)), 
        cast(json_parse(cast(parsed['value-array'] as varchar)) as array(varchar))
    ) as t(key, value)
  • Related