I have a sample of my dataset which looks like this:
WITH raw_string AS (
SELECT CAST
(
'{
"1261369621021236":["It was interactive and the guy was v nice"]
,"4806456876149982":["Mediahub"]
,"501699158168566":["Yes"]
,"293482916246997":["Do it IRL"]
,"1748555938809400":["Europe, Middle East, or Africa (EMEA)"]
,"3019471278301443":["Storytelling for Impact"]
,"1051072995806230":["Yes"],"687313266038215":["28042022"]
,"406048341357670":["Somewhat agree"]
,"304353711772145":["Very satisfied"]
}'
AS JSON
) AS answers_text
)
SELECT *,
JSON_EXTRACT_SCALAR(answers_text, '$["1261369621021236"]') AS q1 FROM raw_string
Why is this returning null? I'm trying to get a value, which in this case should be: "It was interactive and the guy was v nice"
CodePudding user response:
Because answer_text is not an array but just cast as an object with attributes remove the parenthesis and try it as follows:
SELECT *,
JSON_EXTRACT_SCALAR(answers_text, '$.1261369621021236') AS q1 FROM raw_string
look at the following 2 samples I used to come to the above conclusion
SELECT JSON_EXTRACT_SCALAR(JSON '{ "name" : "Jakob", "age" : "6" }', '$.age') AS scalar_age;
SELECT json_extract_scalar('[1, 2, 3]', '$[2]');
I could not run this against presto. My only concern is if the attribute name would be confused for big integer.I also question why you have brackets around the values. It should not be necessary for normal json
CodePudding user response:
From json_extract_scalar
docs:
The value referenced by
json_path
must be a scalar (boolean, number or string):
But you have an json array which is not a scalar value, so you need to either provide path to scalar value - '$.1261369621021236[0]'
:
-- data
WITH raw_string AS (
SELECT json '{"1261369621021236":["It was interactive and the guy was v nice"],
"4806456876149982":["Mediahub"]}' AS answers_text
)
-- query
SELECT JSON_EXTRACT_SCALAR(answers_text, '$.1261369621021236[0]') AS q1
FROM raw_string
Output:
q1 |
---|
It was interactive and the guy was v nice |
Or use JSON_EXTRACT
:
-- query
SELECT JSON_EXTRACT(answers_text, '$.1261369621021236') AS q1
FROM raw_string
Output:
q1 |
---|
["It was interactive and the guy was v nice"] |
Also there are some options using casting to different types. For example provided json can be casted to map(varchar, array(varchar))
and values can be extracted by key and array index (note that in presto the start from 1):
SELECT cast(answers_text as map(varchar, array(varchar)))['1261369621021236'][1] AS q1
FROM raw_string