splitting a dict-like varchar column into multiple columns using SQL presto-CodePudding

I have a column in a table that is varchar but has a dictionary-like format. Some rows have more key-value pairs (for example first row has 3 pairs and second row has 4 pairs). For example:

column
{"customerid":"12345","name":"John", "likes":"Football, Running"}
{"customerid":"54321","name":"Sam", "likes":"Art", "dislikes":"Hiking"}

I need a query that can "explode" the column like so:

customerid	name	likes	dislikes
12345	John	Football, Running
54321	Sam	Art	Hiking

No extra rows are added. Just extra columns (There are other already existing columns in the table).

I've tried casting the varchar column to an array and then using UNNEST function but it doesn't work. I think that method creates extra rows.

I am using Prestosql.

CodePudding user response：

Your data looks like json, so you can parse and process it:

-- sample data
WITH dataset (column) AS (
    VALUES ('{"customerid":"12345","name":"John", "likes":"Football, Running"}' ),
        ('{"customerid":"54321","name":"Sam", "likes":"Art", "dislikes":"Hiking"}')
) 

--query
select json_extract_scalar(json_parse(column), '$.customerid') customerid,
    json_extract_scalar(json_parse(column), '$.name') name,
    json_extract_scalar(json_parse(column), '$.likes') likes,
    json_extract_scalar(json_parse(column), '$.dislikes') dislikes
from dataset

Output:

customerid	name	likes	dislikes
12345	John	Football, Running
54321	Sam	Art	Hiking

In case of many columns you can prettify it by casting to parsed json to map (depended on contents it can be map(varchar, varchar) or map(varchar, json)):

--query
select m['customerid'] customerid,
    m['name'] name,
    m['likes'] likes,
    m['dislikes'] dislikes
from (
    select cast(json_parse(column) as map(varchar, varchar)) m
    from dataset
)