In Athena, we're trying to get all fruits that contain seeds. The issue we're facing is that seeds true/false is nested in json and we can't seem to filter by it in the WHERE
clause. We can only see results when we SELECT
it as hasSeeds
from the nested json path, but that returns all seed results true
and false
instead of just true
-- This returns all seeds, but we want to filter in the WHERE clause by hasSeeds true
SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds
FROM "food_table"
WHERE foodType = 'fruit'
-- 3 attempts to filter by nested json value seeds true (not working)
SELECT * FROM
(SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds
FROM "food_table"
WHERE foodType = 'fruit')
WHERE hasSeeds = true
SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds
FROM "food_table"
WHERE foodType = 'fruit' and hasSeeds = true
SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds
FROM "food_table"
WHERE foodType = 'fruit' and json_extract(payload,'$.food.info.seeds') = true
Any idea how we can get the query to work as expected where we filter by hasSeeds
true on the nested json?
Update with JSON structure:
{
"foodId": 1,
"foodType": "fruit",
"payload": {
"food": {
"name": "apple",
"info": {
"seeds": true,
"calories": 95
}
}
}
}
{
"foodId": 2,
"foodType": "fruit",
"payload": {
"food": {
"name": "banana"
}
}
}
{
"foodId": 3,
"foodType": "vegetable",
"payload": {
}
}
{
"foodId": 4,
"foodType": "fruit",
"payload": {
}
}
Tried this query but it returns incorrect results:
WITH dataset(jsn) AS (
values (JSON '{"payload":{"food":{"info":{"seeds":true}}}}')
)
SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds, jsn
FROM "food_table", dataset
WHERE cast(json_extract_scalar(jsn, '$.payload.food.info.seeds') AS BOOLEAN) = true
Basically, the hasSeeds
comes back as true which is correct, but other food items, like those missing a payload's children still comeback with the jsn field, instead of actually just querying only on the WHERE
clause by something like payload.food.info.seeds
= true
Any idea how to fix it?
CodePudding user response:
Assuming you have next json structure, you need to use json_extract_scalar
:
WITH dataset(jsn) AS (
values (JSON '{"food":{"info":{"seeds":true}}}')
)
SELECT *
FROM dataset
WHERE cast(json_extract_scalar(jsn, '$.food.info.seeds') AS BOOLEAN) = true
UPD
Based on provided json WHERE
clause should look something like:
WHERE try(cast(json_extract_scalar(payload, '$.payload.food.info.seeds') AS BOOLEAN)) = true