Hi there, I am trying to orchestrate a Glue crawler from my step function with the "Wait for callback" option enabled. However, the step function hangs at this step even after the crawler has completed, and it never proceeds to the next step.
Here is my step function's JSON definition:
{
  "Comment": "A description of my state machine",
  "StartAt": "StartCrawler",
  "States": {
    "StartCrawler": {
      "Type": "Task",
      "Next": "Glue StartJobRun",
      "Parameters": {
        "Name": "farmlands-raw-crawler-sam"
      },
      "Resource": "arn:aws:states:::aws-sdk:glue:startCrawler.waitForTaskToken"
    },
    "Glue StartJobRun": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName": "fetch-from-azure-etl-sam"
      },
      "End": true
    }
  }
}
And here is the step function role that I am using (deployed with SAM):
# Role used to invoke step functions
StepFunctionRole:
  Type: AWS::IAM::Role
  Properties:
    RoleName: !Sub "StepFunctionRole"
    Description: "IAM role for step functions"
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - states.amazonaws.com
              - lambda.amazonaws.com
          Action:
            - sts:AssumeRole
    Policies:
      - PolicyName: root
        PolicyDocument:
          Version: "2012-10-17"
          Statement:
            - Effect: Allow
              Action:
                - "lambda:InvokeFunction"
                - "sns:Publish"
                - "events:PutTargets"
                - "events:PutRule"
                - "events:DescribeRule"
                - "iam:GetRole"
                - "iam:PassRole"
                - "states:*"
                - "databrew:*"
                - "athena:*"
                - "glue:*"
                - "logs:*"
                - "s3:*"
                - "sagemaker:*"
              Resource:
                - "*"
Note that I grant all Glue permissions ("glue:*"), so I'm not sure why the step function hangs at the crawler step. I also followed the exact steps from "AWS Step Function is stuck at a state", but that OP had issues with Glue jobs rather than Glue crawlers.
CodePudding user response:
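The callback does not report the crawler's status. With the .waitForTaskToken suffix, Step Functions pauses the state until something returns the task token through SendTaskSuccess or SendTaskFailure, and the crawler never does that, so the execution hangs. You need to implement this yourself: poll the crawler's status and move on in the step function once the crawl has finished with a "SUCCEEDED" status.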
You can read about how to do that here.
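As a rough illustration of the polling approach, here is a minimal sketch of a state machine that starts the crawler without the callback suffix and then loops on GetCrawler until the crawler is idle again. It reuses the crawler and job names from the question. The state names WaitForCrawler, GetCrawler, IsCrawlerDone and CrawlerFailed, the 30 second interval, and the error text are assumptions made for this example, not something from the original definition. It also assumes the crawler is not already running when the execution starts.

```json
{
  "Comment": "Sketch: start the crawler, then poll GetCrawler until it is idle again",
  "StartAt": "StartCrawler",
  "States": {
    "StartCrawler": {
      "Comment": "No .waitForTaskToken suffix, so the task returns as soon as the crawler is started",
      "Type": "Task",
      "Resource": "arn:aws:states:::aws-sdk:glue:startCrawler",
      "Parameters": {
        "Name": "farmlands-raw-crawler-sam"
      },
      "Next": "WaitForCrawler"
    },
    "WaitForCrawler": {
      "Comment": "Assumed polling interval of 30 seconds, tune as needed",
      "Type": "Wait",
      "Seconds": 30,
      "Next": "GetCrawler"
    },
    "GetCrawler": {
      "Type": "Task",
      "Resource": "arn:aws:states:::aws-sdk:glue:getCrawler",
      "Parameters": {
        "Name": "farmlands-raw-crawler-sam"
      },
      "Next": "IsCrawlerDone"
    },
    "IsCrawlerDone": {
      "Comment": "Crawler.State is READY, RUNNING or STOPPING; keep waiting until it is READY, then check the last crawl",
      "Type": "Choice",
      "Choices": [
        {
          "Not": {
            "Variable": "$.Crawler.State",
            "StringEquals": "READY"
          },
          "Next": "WaitForCrawler"
        },
        {
          "Variable": "$.Crawler.LastCrawl.Status",
          "StringEquals": "SUCCEEDED",
          "Next": "Glue StartJobRun"
        }
      ],
      "Default": "CrawlerFailed"
    },
    "CrawlerFailed": {
      "Comment": "Hypothetical failure state for a FAILED or CANCELLED crawl",
      "Type": "Fail",
      "Error": "CrawlerFailed",
      "Cause": "The last crawl did not finish with status SUCCEEDED"
    },
    "Glue StartJobRun": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName": "fetch-from-azure-etl-sam"
      },
      "End": true
    }
  }
}
```

The Choice rules are evaluated in order, so the LastCrawl check only runs once the crawler is back in the READY state. The role in the question already allows "glue:*", which covers both glue:StartCrawler and glue:GetCrawler.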