Hi I am trying to run the following command in Athena
CREATE EXTERNAL TABLE transport_evaluator_prod(
messageId STRING,
type STRING,
causationId STRING,
correlationId STRING,
traceparent STRING,
`data` struct <
evaluationOccurred: STRING,
eta struct < distance: INT,
timeToDestination: INT,
eta: STRING,
destination struct < latitude: DOUBLE,
longitude: DOUBLE,
altitude: DOUBLE >,
destinationEventId: STRING,
origin struct < latitude: DOUBLE,
longitude: DOUBLE,
altitude: DOUBLE >, originEventId: STRING,
plannedArrival: STRING,
locationActionReference: STRING,
resourceUrn: STRING,
eventProvider: STRING,
occured: STRING,
position struct < latitude: DOUBLE,
longitude: DOUBLE,
altitude: DOUBLE >,
equipmentNumber: STRING,
received: STRING > >
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1',
'ignore.malformed.json' = 'true'
)
LOCATION 'changed-for-security'
TBLPROPERTIES ('has_encrypted_data' = 'false')
Unfortunately when I try to run this I get the following error : FAILED: ParseException line 1:189 missing : at 'struct' near '' line 1:262 missing : at 'struct' near '' line 1:363 missing : at 'struct' near '' line 1:579 missing : at 'struct' near ''
Can someone please help?
CodePudding user response:
In reading Create Tables in Amazon Athena from Nested JSON and Mappings Using JSONSerDe | AWS Big Data Blog, I notice that any fields inside a STRUCT
should be referenced as field_name:type
.
This also applies to struct
.
Therefore, this type of line (which is inside a struct
):
destination struct < latitude: DOUBLE,
should be:
destination:struct < latitude: DOUBLE,
Thus, this seems to work:
CREATE EXTERNAL TABLE transport_evaluator_prod(
messageId STRING,
type STRING,
causationId STRING,
correlationId STRING,
traceparent STRING,
`data` struct <
evaluationOccurred: STRING,
eta:struct < distance: INT,
timeToDestination: INT,
eta: STRING,
destination:struct < latitude: DOUBLE,
longitude: DOUBLE,
altitude: DOUBLE >,
destinationEventId: STRING,
origin:struct < latitude: DOUBLE,
longitude: DOUBLE,
altitude: DOUBLE >,
originEventId: STRING,
plannedArrival: STRING,
locationActionReference: STRING,
resourceUrn: STRING,
eventProvider: STRING,
occured: STRING,
position:struct < latitude: DOUBLE,
longitude: DOUBLE,
altitude: DOUBLE >,
equipmentNumber: STRING,
received: STRING > >
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1',
'ignore.malformed.json' = 'true'
)
LOCATION 'changed-for-security'
TBLPROPERTIES ('has_encrypted_data' = 'false')