Home > OS >  Step function hanging on glue crawler step
Step function hanging on glue crawler step

Time:11-22

Hi there I am trying to orchestrate a glue crawler to run in my step function with the "Wait for callback" option enabled. However, the step function hangs at this step even though the crawler is complete, and does not proceed to the next step.

Here is my step function's json definition file

{
  "Comment": "A description of my state machine",
  "StartAt": "StartCrawler",
  "States": {
    "StartCrawler": {
      "Type": "Task",
      "Next": "Glue StartJobRun",
      "Parameters": {
        "Name": "farmlands-raw-crawler-sam"
      },
      "Resource": "arn:aws:states:::aws-sdk:glue:startCrawler.waitForTaskToken"
    },
    "Glue StartJobRun": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName": "fetch-from-azure-etl-sam"
      },
      "End": true
    }
  }
}

And here is my step function role that i am using (deployed using sam)

  # Role used to invoke step functions
  StepFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub "StepFunctionRole"
      Description: "IAM role for step functions"
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - states.amazonaws.com
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: root
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - "lambda:InvokeFunction"
                  - "sns:Publish"
                  - "events:PutTargets"
                  - "events:PutRule"
                  - "events:DescribeRule"
                  - "iam:GetRole"
                  - "iam:PassRole"
                  - "states:*"
                  - "databrew:*"
                  - "athena:*"
                  - "glue:*"
                  - "logs:*"
                  - "s3:*"
                  - "sagemaker:*"
                Resource:
                  - "*"

Notice how i give all glue permissions "glue:*", so i'm not sure why the step function hangs at the crawler step. I've also followed the exact same steps here AWS Step Function is stuck at a state but the OP had issues with glue jobs instead of glue crawlers

CodePudding user response:

The callback does not provide the Crawler's status. You need to implement this functionality which can poll the Crawler's status and move forward in step function once it is in "SUCCEEDED" state after running.

You can read about how to do that here.

  • Related