Home > Mobile >  transform nested json in Athena to show in Quicksight
transform nested json in Athena to show in Quicksight

Time:11-23

I have written below Athena query to show/visualize data in Quicksight. This work fine as long as i have one record in table. If you see the query i have hardcoded as index 1. How i can modify below query to include all the data from the table. If remove index 1 then it gives me error as

SYNTAX_ERROR: line 8:19: Expression result.extensions.response is not of type ROW

Query:

select user_id, assessment_id, created_by,result.extensions.response[1].assessmentid AS AssesmentId,
result.extensions.response[1].assessmentname AS AssesmentName,
response_json.questionid AS QuestionId,
response_json.questionText as Questiontext,
transform(response_json.answers,answer-> answer.answerId) AS AnswerID,
transform(response_json.answers,answer-> answer.answerText) AS AnswerText
FROM focalbucket
CROSS JOIN UNNEST(result.extensions.response[1].responseData) AS t(response_json)

Output: enter image description here

Athena Table DDL:

CREATE EXTERNAL TABLE `focalbucket`(
  `assessment_id` int COMMENT 'from deserializer', 
  `id` string COMMENT 'from deserializer', 
  `user_id` string COMMENT 'from deserializer', 
  `project_id` int COMMENT 'from deserializer', 
  `created_by` string COMMENT 'from deserializer', 
  `team_id` int COMMENT 'from deserializer', 
  `result`     struct<extensions:struct<response:array<struct<assessmentid:int,assessmentname:string,assessmentcreateddate:string,   responsedata:array<struct<questionid:int,answers:array<struct<answerid:int,answertext:string>>,
   questiontext:string,questiontype:string>>,userfullname:string>>>,
   completion:boolean,platform:string,app_version:string> COMMENT 'from deserializer', 
  `verb` struct<id:string> COMMENT 'from deserializer', 
  `actor` struct<mbox:string,name:string> COMMENT 'from deserializer', 
  `timestamp` bigint COMMENT 'from deserializer', 
  `version` string COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe'

CodePudding user response:

result.extensions.response is an array also so you can unnest it too. Something like this (note - not tested since no example data was provided):

select 
   user_id, 
   assessment_id, 
   created_by,
   t.response.assessmentid AS AssesmentId,
   t.response.assessmentname AS AssesmentName,
   response_json.questionid AS QuestionId,
   response_json.questionText as Questiontext,
   transform(response_json.answers,answer-> answer.answerId) AS AnswerID,
   transform(response_json.answers,answer-> answer.answerText) AS AnswerText
FROM focalbucket
CROSS JOIN UNNEST(result.extensions.response) AS t(response)
CROSS JOIN UNNEST(t.response.responseData) AS tt(response_json)
  • Related