In my table I've a column, which is a varchar but has a nested dictionary-like format (three nested levels). Some entries have multiple key-value pairs (customer ID & name), while some just have a single entry (customer ID). For example:
column
{
"customer_type1": {
"location1": {"customerid":"12345","name":"John"},
"location2": {"customerid":"12346","name":"Conor"},
}
"customer_type2": {
"location3": {"customerid":"12347","name":"Brian"},
"location4": {"customerid":"12348"},
}
}
I need a query that will break out the the column into a table like this:
customer_type Location Customer_ID Name
customer_type1 location1 12345 John
customer_type1 location2 12346 Conor
customer_type2 location3 12347 Brian
customer_type2 location4 12348
I know a solution to extract for a single nested key-value pair, but cant edit it to work for a nested dictionary like this. I'm using Prestosql.
--query for single nested key-value pair
select json_extract_scalar(json_parse(column), '$.customer_id') customerid,
json_extract_scalar(json_parse(column), '$.name') name
from dataset
CodePudding user response:
Common approach with this kind of dynamic json is to cast it to MAP
, in this case nested map - MAP(VARCHAR, MAP(VARCHAR, JSON))
and use unnest
to flatten the result:
-- sample data
WITH dataset (json_str) AS (
VALUES (
'{
"customer_type1": {
"location1": {"customerid":"12345","name":"John"},
"location2": {"customerid":"12346","name":"Conor"}
},
"customer_type2": {
"location3": {"customerid":"12347","name":"Brian"},
"location4": {"customerid":"12348"}
}
}'
)
)
-- query
select customer_type,
location,
json_extract_scalar(cust_json, '$.customerid') customer_id,
json_extract_scalar(cust_json, '$.name') name
from (
select cast(json_parse(json_str) as map(varchar, map(varchar, json))) as maps
from dataset
)
cross join unnest(maps) as t(customer_type, location_map)
cross join unnest(location_map) as t(location, cust_json)
Output:
customer_type | location | customer_id | name |
---|---|---|---|
customer_type1 | location1 | 12345 | John |
customer_type1 | location2 | 12346 | Conor |
customer_type2 | location3 | 12347 | Brian |
customer_type2 | location4 | 12348 |