I'm new to Presto SQL and I am stuck to a issue and I really need your help.
I have a table with a field "Info":
Date Info
2021-01-01 {"user_id":"12345", "order_id": "[333222, 444555, 777888]"}
2021-01-02 {"user_id":"67891", "order_id": "[948263]"}
2021-01-03 {"user_id":"93846", "order_id": "[937846, 889213, 886534, 991827]"}
2021-01-04 {"user_id":"63792", "order_id": "[]"}
2021-01-05 {"user_id":"04937", "order_id": "[837462, 837213]"}
I want to pull all order_id(s) and convert them as rows, the output may look like this:
Date order_id
2021-01-01 333222
2021-01-01 444555
2021-01-01 777888
2021-01-02 948263
2021-01-03 937846
...
2021-01-05 837213
My query:
SELECT Date, Info['ORDER_ID']
FROM test_table
I tried the query above but it can only returns the order_id as an array. Anyone know how to do it? Thank you so much and I appreciate your help!
CodePudding user response:
Documentation is you friend. You are interested in json functions and unnest
(in the code sample succinct version of syntax is used). In addition to this there is one trick - "[333222, 444555, 777888]"
is a double encoded json array, so you need to parse it two times (first one is done by json_extract_scalar
):
-- sample data
WITH dataset (Date, Info) AS (
VALUES ('2021-01-01', '{"user_id":"12345", "order_id": "[333222, 444555, 777888]"}')
)
-- query
select Date, order_id
from dataset,
unnest (cast(json_parse(json_extract_scalar(Info, '$.order_id')) as array(integer))) as t(order_id);
Output:
Date | order_id |
---|---|
2021-01-01 | 333222 |
2021-01-01 | 444555 |
2021-01-01 | 777888 |