Is there a way to find the average runtime of an ECS scheduled task on AWS?

Time:03-11

I have a scheduled task running on ECS with a launch type of Fargate. Is there a way to find the average runtime of the task over the past X amount of time?

CodePudding user response:

As you're no doubt already aware, ECS CloudWatch metrics are limited to the cluster and service level. I thought AWS Batch might offer task-level metrics and was going to recommend it, but it doesn't seem to provide any either.

So that means generating the metrics yourself. I think that the best approach will be to implement a Lambda that is triggered by an EventBridge event.

As it happens, I just did this (not for generating metrics, sorry), so can outline the basic steps.

The first step is to set up the EventBridge rule. Here's the rule that I used, which triggers the Lambda whenever a task from a specific task definition stops:

{
  "detail-type": ["ECS Task State Change"],
  "source": ["aws.ecs"],
  "detail": {
    "taskDefinitionArn": [{
      "prefix": "arn:aws:ecs:us-east-1:123456789012:task-definition/HelloWorld:"
    }],
    "lastStatus": ["STOPPED"]
  }
}

This matches any revision of one specific task definition. If you remove HelloWorld: from the prefix, the rule will trigger for every task definition in that account and region.
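If you need the same rule for several task definitions, the pattern can be generated instead of hand-edited. The following is a minimal sketch, not part of my original setup: build_event_pattern and its parameters are hypothetical names, and it only builds the JSON document (creating the rule itself is left out).

```python
import json


def build_event_pattern(region, account_id, family=None):
    """Build an EventBridge pattern that matches stopped ECS tasks.

    `family` is the task definition name (e.g. "HelloWorld"). When it is
    omitted, the prefix matches every task definition in the account/region.
    """
    prefix = f"arn:aws:ecs:{region}:{account_id}:task-definition/"
    if family is not None:
        # The trailing colon keeps "HelloWorld" from also matching "HelloWorld2".
        prefix += f"{family}:"
    return {
        "detail-type": ["ECS Task State Change"],
        "source": ["aws.ecs"],
        "detail": {
            "taskDefinitionArn": [{"prefix": prefix}],
            "lastStatus": ["STOPPED"],
        },
    }


if __name__ == "__main__":
    pattern = build_event_pattern("us-east-1", "123456789012", "HelloWorld")
    print(json.dumps(pattern, indent=2))
```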

Your Lambda will be invoked with the following event (I've removed everything that isn't relevant to this answer):

{
    "version": "0",
    "id": "2e08a760-c304-9681-9509-e6c9ca88ee36",
    "detail-type": "ECS Task State Change",
    "source": "aws.ecs",
    "detail": {
        "createdAt": "2022-03-10T15:18:57.782Z",
        "pullStartedAt": "2022-03-10T15:19:10.488Z",
        "pullStoppedAt": "2022-03-10T15:19:26.541Z",
        "startedAt": "2022-03-10T15:19:26.846Z",
        "stoppingAt": "2022-03-10T15:19:36.946Z",
        "stoppedAt": "2022-03-10T15:19:50.213Z",
        "stoppedReason": "Essential container in task exited",
        "stopCode": "EssentialContainerExited",
        "taskDefinitionArn": "arn:aws:ecs:us-east-1:123456789012:task-definition/HelloWorld:1"
    }
}

So, you've got a bunch of ISO-8601 timestamps, which should be easy to translate into elapsed-time values for whatever metrics you want to track. And you've got the task definition ARN, from which you can extract the task definition name. With those values you can call the PutMetricData API. I also left stopCode in the example, because you might want to track successful versus unsuccessful executions.
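As a rough sketch of that translation in Python: the helper names below are made up for illustration, and the definitions of RunTime (startedAt to stoppedAt) and TotalTime (createdAt to stoppedAt) are my assumption about what you'd want to measure. It also assumes every timestamp it reads is present; a task that never started may not have a startedAt, so real code would need to handle that.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse a timestamp like '2022-03-10T15:18:57.782Z' as an aware datetime."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def elapsed_seconds(detail, start_key, end_key):
    """Seconds between two timestamp fields of the event's 'detail' object."""
    delta = parse_timestamp(detail[end_key]) - parse_timestamp(detail[start_key])
    return delta.total_seconds()


def task_definition_name(arn):
    """'arn:...:task-definition/HelloWorld:1' -> 'HelloWorld'."""
    return arn.split("/")[-1].rsplit(":", 1)[0]


def summarize(event):
    """Pull out the values worth reporting from an ECS Task State Change event."""
    detail = event["detail"]
    return {
        "task": task_definition_name(detail["taskDefinitionArn"]),
        "RunTime": elapsed_seconds(detail, "startedAt", "stoppedAt"),
        "TotalTime": elapsed_seconds(detail, "createdAt", "stoppedAt"),
        "stopCode": detail.get("stopCode"),
    }


if __name__ == "__main__":
    sample = {
        "detail": {
            "createdAt": "2022-03-10T15:18:57.782Z",
            "startedAt": "2022-03-10T15:19:26.846Z",
            "stoppedAt": "2022-03-10T15:19:50.213Z",
            "stopCode": "EssentialContainerExited",
            "taskDefinitionArn": "arn:aws:ecs:us-east-1:123456789012:task-definition/HelloWorld:1",
        }
    }
    print(summarize(sample))
```

For the event shown above, this gives a RunTime of about 23.4 seconds and a TotalTime of about 52.4 seconds.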

I recommend that you name each metric after the elapsed-time value it holds (e.g., RunTime, TotalTime), and use the task definition name as the dimension.
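Here is a sketch of what the PutMetricData call could look like with boto3, using that naming scheme. The namespace Custom/ECSTasks, the dimension name TaskDefinition, and the function names are all placeholders I picked for illustration; the elapsed values are assumed to have been computed already from the event timestamps.

```python
def build_metric_data(task_name, elapsed):
    """Turn {"RunTime": 23.4, ...} into the MetricData list for PutMetricData.

    One metric per elapsed-time value, dimensioned by task definition name.
    """
    return [
        {
            "MetricName": metric_name,
            "Dimensions": [{"Name": "TaskDefinition", "Value": task_name}],
            "Value": seconds,
            "Unit": "Seconds",
        }
        for metric_name, seconds in elapsed.items()
    ]


def publish(task_name, elapsed, namespace="Custom/ECSTasks", client=None):
    """Send the metrics to CloudWatch.

    `client` can be passed in for testing; otherwise a boto3 CloudWatch
    client is created (boto3 is available in the Lambda Python runtime).
    """
    if client is None:
        import boto3  # imported lazily so the builder works without boto3

        client = boto3.client("cloudwatch")
    client.put_metric_data(
        Namespace=namespace,
        MetricData=build_metric_data(task_name, elapsed),
    )


if __name__ == "__main__":
    # Prints the payload only; nothing is sent to AWS here.
    print(build_metric_data("HelloWorld", {"RunTime": 23.367, "TotalTime": 52.431}))
```

Once the data points are in CloudWatch, the Average statistic on RunTime over whatever period you choose gives you the average runtime from the original question.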
