How can I create a step that is repeated if the Lambda function returns a "failed" status?
job2 = job1.next(Choice(self, "Check status.").when(Condition.number_equals("$.statusCode", 200), job3).otherwise(job2))
CodePudding user response:
You cannot do this purely inside Step Functions. More accurately, there may be a way, but that path is complicated, and you risk being charged thousands of dollars: a failing Lambda makes the system rerun continuously, and you are charged for every run.
You can use a Dead Letter Queue, or a Catch in your Step Function definition, to add an event to an SQS queue and have that fire every so often to restart failed jobs. But again, be careful: it is very easy to end up charging yourself or your company thousands of dollars.
CodePudding user response:
The Choice state should have 3 branches:
- if job1 succeeded, go to job2
- if job1 failed AND a retry count has not been reached, loop incrementTask -> job1
- otherwise go to the Fail state
checkStatus = (
    sfn.Choice(self, "CheckStatus")
    .when(sfn.Condition.number_equals("$.job1.statusCode", 200), job2)
    .when(
        sfn.Condition.and_(
            sfn.Condition.not_(sfn.Condition.number_equals("$.job1.statusCode", 200)),
            sfn.Condition.or_(
                sfn.Condition.is_not_present("$.retryCount"),
                sfn.Condition.number_less_than("$.retryCount", 3),
            ),
        ),
        incrementTask.next(job1),
    )
    .otherwise(fail)
)
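The branching can be checked locally before deploying. The following is a minimal sketch in plain Python, not CDK code: route is a hypothetical helper that mirrors the three conditions above, and it assumes the output of job1 is always present under the job1 key.

```python
MAX_RETRIES = 3  # same bound as number_less_than("$.retryCount", 3)


def route(state: dict) -> str:
    """Return the name of the state the Choice would pick (hypothetical helper)."""
    if state["job1"]["statusCode"] == 200:
        return "job2"
    # retryCount is absent on the first pass, which counts as "limit not reached"
    if "retryCount" not in state or state["retryCount"] < MAX_RETRIES:
        return "incrementTask"
    return "fail"


print(route({"job1": {"statusCode": 200}}))                   # job2
print(route({"job1": {"statusCode": 500}}))                   # incrementTask
print(route({"job1": {"statusCode": 500}, "retryCount": 3}))  # fail
```

With this bound, job1 runs at most four times in total: the first attempt plus three retries.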
Add a Lambda Task that increments the retry count before looping back to job1. Make sure to set a result_path on job1 so that it does not overwrite the retryCount.
# incrementTask handler
def handler(event, context):
    # Start from 0 when retryCount is not set yet, then add one
    event["retryCount"] = event.get("retryCount", 0) + 1
    return event
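Why the result_path on job1 matters can be seen in a simplified sketch. The helper apply_task_result is hypothetical and only mimics how a task's result is combined with its input. It assumes the task returns a plain dict, and that the path is a single top-level key such as "$.job1".

```python
from typing import Optional


def apply_task_result(state: dict, result: dict, result_key: Optional[str] = None) -> dict:
    """Mimic how a task's output is combined with its input (simplified)."""
    if result_key is None:
        # Default behaviour: the task result replaces the whole state
        return result
    # With result_path="$.<result_key>" the result is stored under one key
    return {**state, result_key: result}


state = {"retryCount": 2}
job1_result = {"statusCode": 500}

print(apply_task_result(state, job1_result))          # {'statusCode': 500}
print(apply_task_result(state, job1_result, "job1"))  # {'retryCount': 2, 'job1': {'statusCode': 500}}
```

In the first call the counter is lost, so the loop would never reach its limit. In the second call both retryCount and the job1 status survive, which is what the Choice conditions read.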
The state machine will look like this: