Setup
Security Groups
ALB (inbound rules)
- HTTPS:443 from 0.0.0.0/0 & ::/0
- HTTP:80 from 0.0.0.0/0 & ::/0
Cluster (inbound rules)
- All traffic from ALB security group
Cluster
- instance is t2.micro (only running 1 instance, in subnets us-east-1<a,b,c> under the default VPC with public IP enabled)
- client → 0.375 vCPU/0.25 GB, 1 task, bridge network, 0:3000 (host:container)
- server → 0.25 vCPU/0.25 GB, 2 tasks, bridge network, 0:5000 (host:container)
ALB
- availability zones: us-east-1<a,b,c>, same default VPC
- listeners:
  - HTTP:80 → redirect to HTTPS://#{host}:443/#{path}?#{query}
  - HTTPS:443 (/) → forward to client target group
  - HTTPS:443 (/api) → forward to server target group
Target Groups
- client → HTTP:3000 with default health check of HTTP, /, Traffic Port, 5 healthy, 2 unhealthy, 5s timeout, 30s interval, 200 OK
- server → HTTP:5000 with health check of HTTP, /api/health, Traffic Port, 5 healthy, 2 unhealthy, 5s timeout, 30s interval, 200 OK
Both Docker images (client and server) work properly locally, and the client service seems to work well in AWS ECS. However, the server service keeps cycling between registering and deregistering (draining) the container instances, seemingly without ever becoming unhealthy.
Here is what I see in the service's Deployments and events tab:
5/12/2022, 8:43:04 PM service server registered 2 targets in target-group <...>
5/12/2022, 8:42:54 PM service server has started 2 tasks: task <...> task <...>. <...>
5/12/2022, 8:42:51 PM service server deregistered 1 targets in target-group <...>
5/12/2022, 8:42:51 PM service server has begun draining connections on 1 tasks. <...>
5/12/2022, 8:42:51 PM service server deregistered 1 targets in target-group <...>
5/12/2022, 8:42:17 PM service server registered 2 targets in target-group <...>
5/12/2022, 8:42:07 PM service server has started 2 tasks: task <...> task <...>. <...>
5/12/2022, 8:42:04 PM service server deregistered 1 targets in target-group <...>
5/12/2022, 8:42:04 PM service server has begun draining connections on 1 tasks. <...>
5/12/2022, 8:42:04 PM service server deregistered 1 targets in target-group <...>
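As a rough sanity check, the timestamps above can be compared against the server target group's health check settings. The sketch below assumes the first check can run right at registration and later ones once per interval, so it only gives a lower bound; the helper names are made up for illustration.

```python
from datetime import datetime

# Timestamp format used in the ECS events tab, as pasted above.
FMT = "%m/%d/%Y, %I:%M:%S %p"


def parse(ts: str) -> datetime:
    """Parse one event timestamp from the log above."""
    return datetime.strptime(ts, FMT)


def min_seconds_for_consecutive_checks(threshold: int, interval_s: int) -> int:
    """Lower bound on the time needed to collect `threshold` consecutive results.

    Assumption: the first check happens at registration, then one per interval.
    """
    return (threshold - 1) * interval_s


# One full cycle copied from the event log.
registered = parse("5/12/2022, 8:42:17 PM")
deregistered = parse("5/12/2022, 8:42:51 PM")
registered_again = parse("5/12/2022, 8:43:04 PM")

lifetime_s = int((deregistered - registered).total_seconds())
cycle_s = int((registered_again - registered).total_seconds())

# Settings from the server target group: 5 healthy checks, 30s interval.
min_healthy_s = min_seconds_for_consecutive_checks(threshold=5, interval_s=30)

print(f"target lifetime:       {lifetime_s}s")
print(f"cycle period:          {cycle_s}s")
print(f"min time to 'healthy': {min_healthy_s}s")
```

Under those assumptions a target lives for about 34 seconds, while reaching the healthy state would take at least 120 seconds, so none of these targets could have been reported healthy before being deregistered.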
Any ideas?
CodePudding user response:
Have you added your ALB SG as a source to the SG attached to your containerized application?
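One way to check this without clicking through the console is to inspect the inbound rules returned by `describe_security_groups`. The following is only a sketch: the group IDs and the port in the usage example are hypothetical placeholders, and `fetch_security_group` assumes boto3 is installed and credentials for us-east-1 are configured.

```python
from __future__ import annotations

from typing import Any


def allows_source_group(security_group: dict[str, Any], source_group_id: str, port: int) -> bool:
    """Return True if an inbound rule admits `source_group_id` on the given TCP port."""
    for rule in security_group.get("IpPermissions", []):
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            # "All traffic" rule: every protocol and port, no FromPort/ToPort.
            port_ok = True
        elif protocol == "tcp":
            port_ok = rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)
        else:
            port_ok = False
        sources = {pair.get("GroupId") for pair in rule.get("UserIdGroupPairs", [])}
        if port_ok and source_group_id in sources:
            return True
    return False


def fetch_security_group(group_id: str) -> dict[str, Any]:
    """Fetch one security group description from EC2 (needs AWS credentials)."""
    import boto3  # imported lazily so the checker above works without boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    return ec2.describe_security_groups(GroupIds=[group_id])["SecurityGroups"][0]


# Offline example mirroring the "All traffic from ALB security group" rule.
# Both group IDs are hypothetical.
cluster_sg = {
    "GroupId": "sg-cluster-example",
    "IpPermissions": [
        {"IpProtocol": "-1", "UserIdGroupPairs": [{"GroupId": "sg-alb-example"}]}
    ],
}
print(allows_source_group(cluster_sg, "sg-alb-example", 32768))
print(allows_source_group(cluster_sg, "sg-other-example", 32768))
```

Because the tasks use bridge networking with host port 0, the ALB reaches them on a dynamically assigned host port, which is why the rule has to cover more than port 5000 itself. The same check works for any pair of groups, for example a database SG and the cluster SG.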
CodePudding user response:
After enabling AWS CloudWatch logs in my task definition's container specs, I was able to see that the issue was actually with an AWS RDS instance.
The RDS instance's SG was only accepting traffic from an old cluster SG (which no longer exists), so the server could not reach the database. That clears up why a health check was never being performed and why the registered instances were draining immediately.
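For anyone hitting the same symptom, the CloudWatch logging mentioned above is configured per container through the `awslogs` log driver. Here is a minimal sketch of what that part of a container definition could look like; the log group name and stream prefix are assumptions, the image is left elided, and the log group is assumed to exist already.

```python
import json


def with_awslogs(container_def: dict, log_group: str, region: str, stream_prefix: str) -> dict:
    """Return a copy of an ECS container definition with the awslogs driver set."""
    updated = dict(container_def)
    updated["logConfiguration"] = {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": log_group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }
    return updated


# Server container as described in the question: 0.25 vCPU (256 CPU units),
# 0.25 GB (256 MiB), container port 5000 on a dynamic host port.
server = {
    "name": "server",
    "image": "<...>",  # elided in the original post
    "cpu": 256,
    "memory": 256,
    "portMappings": [{"containerPort": 5000, "hostPort": 0}],
}

# "/ecs/server" and "server" are hypothetical names chosen for illustration.
server_with_logs = with_awslogs(server, "/ecs/server", "us-east-1", "server")
print(json.dumps(server_with_logs["logConfiguration"], indent=2))
```

With this in place, the container's stdout and stderr show up in CloudWatch Logs, which is where a failed database connection on startup would be visible.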