In my workflow I query DynamoDB for tables whose load_fail status equals 1.
If at least one table is returned, a Glue job needs to start with that list of tables as its --source_tables argument.
Below is my entire state machine.
{
  "Comment": "A description of my state machine",
  "StartAt": "Query",
  "States": {
    "Query": {
      "Type": "Task",
      "Next": "Choice",
      "Parameters": {
        "TableName": "source_tables_load_status",
        "KeyConditionExpression": "load_fail = :load_fail",
        "ExpressionAttributeValues": {
          ":load_fail": {
            "S": "1"
          }
        }
      },
      "Resource": "arn:aws:states:::aws-sdk:dynamodb:query",
      "ResultSelector": {
        "count.$": "$.Count",
        "startTime.$": "$$.Execution.StartTime",
        "items.$": "$.Items[*].table_name.S"
      }
    },
    "Choice": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.count",
          "NumericGreaterThanEquals": 1,
          "Next": "start_glue"
        }
      ],
      "Default": "Success"
    },
    "start_glue": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun",
      "Parameters": {
        "JobName": "data-moving-glue",
        "Arguments": {
          "--dynamodb_metadata_table": "metadata_table",
          "--source_tables.$": "$.items"
        }
      },
      "End": true
    },
    "Success": {
      "Type": "Succeed"
    }
  }
}
Currently I'm getting an error caused by "--source_tables.$": "$.items".
How can I pass "--source_tables":["dbo.Table_Two", "dbo.Table_Three"] to the Glue job from the state machine? The error is:
An error occurred while executing the state 'start_glue' (entered at the event id #9).
The Parameters '{"JobName":"data-moving-glue","Arguments":{"--dynamodb_metadata_table":"metadata_table","--source_tables":["dbo.Table_Two", "dbo.Table_Three"]}}'
could not be used to start the Task: [The value for the field '--source_tables' must be a STRING]
CodePudding user response:
I wrapped the result in quotes, turning it into a string with the States.Format intrinsic function:
https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-intrinsic-functions.html
"--source_tables.$": "States.Format('{}', $.items)"
The new output is:
"--source_tables": "[\"dbo.TableOne\",\"dbo.TableTwo\"]"
The Glue job can then parse this string back into a list with a function.
eval is used here only as an example! Don't use it in real code, because evaluating untrusted input can compromise your job.
lst = "[\"dbo.TableOne\",\"dbo.TableTwo\"]"
for t in eval(lst):
    print(t)
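Since the string produced by States.Format is valid JSON, a safer way to do the same thing is json.loads. Below is a minimal sketch, assuming the Glue job receives exactly the string shown above; parse_source_tables is a hypothetical helper name, not part of any AWS library.

```python
import json
from typing import List


def parse_source_tables(raw: str) -> List[str]:
    """Parse the JSON-encoded --source_tables value into a list of table names."""
    tables = json.loads(raw)  # only parses JSON; never executes code
    if not isinstance(tables, list) or not all(isinstance(t, str) for t in tables):
        raise ValueError(f"expected a JSON array of strings, got: {raw!r}")
    return tables


# Assumption: in the real job this value comes from the job arguments;
# here it is hard-coded to the output shown above.
raw = "[\"dbo.TableOne\",\"dbo.TableTwo\"]"
for table in parse_source_tables(raw):
    print(table)
```

In a Glue script the raw value would typically be read with getResolvedOptions from awsglue.utils before being passed to the helper.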