Nginx container on ECS not following rolling update

Time:01-02

I'm so close to finding a nice setup with Docker Compose and ECS, but there is one small thing remaining.

The scenario goes like this:

  1. Update app (Django) source code and deploy to ECS using Docker Compose and Docker Context.
  2. ECS registers a new task for the app and starts it along with the old one.
  3. Problem: nginx keeps health-checking the old container. Once that container is deregistered, nginx starts returning 502 errors and the task is restarted, which causes downtime.
  4. Nginx starts up again and health-checks the new container. The app works again, but only after the undesired downtime.

Is there some config I need to do here? Am I missing something?

docker-compose.yml for reference:

services:
  web:
    image: # Image from ECR, built from GH-action.
    command: gunicorn core.wsgi:application --bind 0.0.0.0:8000
    environment:
      # ...
    volumes:
      # ...
    deploy:
      replicas: 1

  nginx:
    image: # Image from ECR, kept static
    ports:
      - "80:80"
    volumes:
      # ...
    depends_on:
      - web
    deploy:
      replicas: 1

Answer:

It turns out this was quite a challenge. The bottom line is that a running ECS task cannot be updated in place, so we either have to restart the task or use execute-command to update it manually.
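For reference, here is a rough sketch of what those two options could look like with the AWS CLI. The cluster and service names are hypothetical placeholders, not values from my setup (the Compose-CLI generates its own names, so look them up in the ECS console or the CloudFormation stack). As far as I know, execute-command also needs ECS Exec enabled on the service and the Session Manager plugin installed locally.

```shell
#!/bin/sh
# Hypothetical names; substitute your own cluster and service.
CLUSTER="my-cluster"
SERVICE="my-service"

# Option 1: force ECS to replace the running tasks with fresh ones
# started from the current task definition.
redeploy() {
    aws ecs update-service \
        --cluster "$CLUSTER" \
        --service "$SERVICE" \
        --force-new-deployment
}

# Option 2: open a shell inside a running container to change it by hand.
# $1 = task ID or ARN, $2 = container name
exec_into() {
    aws ecs execute-command \
        --cluster "$CLUSTER" \
        --task "$1" \
        --container "$2" \
        --interactive \
        --command "/bin/sh"
}
```

Neither option gives a clean rolling update on its own, which is why I kept looking for a different setup.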

I tried the jwilder/nginx-proxy approach, but this was not possible with Fargate because of the way volume mounting works with that launch type.

I ended up using the sidecar pattern for my nginx container. However, the Compose-CLI currently has no built-in support for sidecars (see https://github.com/docker/compose-cli/issues/1566), so I had to use the x-aws-cloudformation overlay in a slightly messy way.

So first we just remove the nginx service:

version: "3.9"

services:
  web:
    image: # django-app
    command: gunicorn core.wsgi:application --bind 0.0.0.0:8000
    environment:
      # ...
    volumes:
      # ...
    ports:
      - "80:80" # Move ports into this service so we get the ALB 
    deploy:
      replicas: 1

Run the convert command to get the generated CloudFormation template:

docker compose convert > cfn.yml
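The overlay in the next step refers to resources by their generated logical IDs, so it can help to confirm those IDs exist in the template first. A minimal sketch, assuming the template is YAML with each logical ID on its own line followed by a colon; check_logical_ids is a helper name I made up for illustration:

```shell
#!/bin/sh
# Report whether each logical ID is defined in the template.
# $1 = path to the template, remaining arguments = logical IDs to look for.
# Returns non-zero if any ID is missing.
check_logical_ids() {
    template="$1"
    shift
    missing=0
    for id in "$@"; do
        # Match "<indent>ID:" so that references like "Ref: ID" do not count.
        if grep -q -E "^[[:space:]]*${id}:" "$template"; then
            echo "found: $id"
        else
            echo "missing: $id"
            missing=1
        fi
    done
    return "$missing"
}
```

For the setup in this answer, the call would be: check_logical_ids cfn.yml WebTaskDefinition WebService WebTCP80TargetGroup. If your service has a different name, the IDs will differ accordingly.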

Then add the x-aws-cloudformation overlay:

x-aws-cloudformation:
  Resources:
    WebTaskDefinition:
      Properties:
        ContainerDefinitions:
          # Generated container definitions, copy/paste from cfn.yml
          # Only change ContainerPort for web:
          - # ...Web_ResolvConf_InitContainer
          - # ...Web-container
            PortMappings:
              - ContainerPort: 8000

          # The nginx sidecar:
          - DependsOn:
              - Condition: SUCCESS
                ContainerName: Web_ResolvConf_InitContainer
              - Condition: START
                ContainerName: web
            Essential: true
            Image: # nginx
            LogConfiguration:
              # ...
            MountPoints:
              # ...
            Name: nginx
            PortMappings:
              - ContainerPort: 80
    
    # We also need to tell the load-balancer to reference the nginx container
    WebService:
      Properties:
        LoadBalancers:
          - ContainerName: nginx
            ContainerPort: 80
            TargetGroupArn:
              Ref: WebTCP80TargetGroup

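Finally, we need to change the nginx config a bit. Containers in the same task share a network namespace (Fargate uses the awsvpc network mode), so nginx reaches gunicorn on a local address instead of the web hostname: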

# BEFORE
upstream app_server {
    server web:8000 fail_timeout=0;
}

# AFTER
upstream app_server {
    server 0.0.0.0:8000 fail_timeout=0;
}

Not pretty, but it works. Rolling updates will now have zero downtime as expected. Hopefully, this pattern will be improved with the evolution of the Compose-CLI!
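If you want to check the zero-downtime claim on your own stack, one simple approach is to poll the load balancer while a deployment is running and count the failed responses. This is only a sketch: poll_status and count_failures are names I made up, and the URL you pass in is whatever your ALB DNS name happens to be.

```shell
#!/bin/sh
# Print one HTTP status code per second for the given URL.
# $1 = URL, $2 = number of requests (defaults to 60)
poll_status() {
    i=0
    while [ "$i" -lt "${2:-60}" ]; do
        # -m 5 caps each request at 5 seconds; curl prints 000 if the
        # connection fails entirely.
        curl -s -o /dev/null -m 5 -w '%{http_code}\n' "$1"
        i=$((i + 1))
        sleep 1
    done
}

# Read status codes from stdin and print how many are not 2xx or 3xx.
count_failures() {
    awk '!/^[23][0-9][0-9]$/ { n++ } END { print n + 0 }'
}
```

Start something like poll_status "$URL" 120 | count_failures just before deploying. A result of 0 means no request failed during the polling window, while any 502 or 000 in the stream would show up in the count.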
