🔍 ECS Task Not Running - Debugging Checklist
If an ECS service is deployed but no tasks are running, follow this step-by-step checklist to systematically identify and fix the issue.
1️⃣ Check ECS Service Deployment Status
Run:
aws ecs describe-services --cluster <your-cluster-name> --services <your-service-name> \
--query "services[0].{Status:status, DesiredCount:desiredCount, RunningCount:runningCount, PendingCount:pendingCount, Events:events}"
✅ Expected Output:
{
"Status": "ACTIVE",
"DesiredCount": 1,
"RunningCount": 1,
"PendingCount": 0,
"Events": []
}
🔴 If RunningCount = 0
and PendingCount = 0
, continue debugging.
🔍 If the response contains:
{
"message": "(service <service-name>) was unable to place a task because no container instance met all of its requirements. Reason: No Container Instances were found in your cluster."
}
✅ Step to check container instances:
Run:
aws ecs list-container-instances --cluster <your-cluster-name>
✅ Expected: List of registered container instances.
🔴 If empty ([]
), EC2 instances are not joining ECS.
✅ Fix: Proceed to the next steps to resolve.
2️⃣ Check If EC2 Instances Exist
Run:
aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names <asg-name> \
--query "AutoScalingGroups[0].Instances[*].InstanceId"
✅ Expected: List of EC2 instance IDs.
🔴 If empty, ASG is not launching instances.
✅ Fix: Scale up ASG:
aws autoscaling update-auto-scaling-group --auto-scaling-group-name <asg-name> --desired-capacity 2
🔴 If instances exist but are not registered in ECS:
3️⃣ Ensure EC2 Instances Have Internet Access (NAT Gateway Check)
Run:
aws ec2 describe-route-tables --filters "Name=vpc-id,Values=<your-vpc-id>" --query "RouteTables[*].Routes[*].{Destination:DestinationCidrBlock, Target:NatGatewayId}"
✅ Expected: At least one route with NatGatewayId
set.
🔴 If missing, tasks cannot reach ECS API to register.
✅ Fix: Ensure NAT Gateway is correctly configured:
- Check if a NAT Gateway exists:
aws ec2 describe-nat-gateways --query "NatGateways[*].NatGatewayId"
- Ensure private subnets have a route to the NAT Gateway.
- If missing, create a NAT Gateway and add a route to private subnets.
4️⃣ Ensure EC2 Instances Join ECS Cluster
Run:
cat /etc/ecs/ecs.config
✅ Expected Output:
ECS_CLUSTER=<your-cluster-name>
🔴 If missing, manually set it:
echo "ECS_CLUSTER=<your-cluster-name>" | sudo tee -a /etc/ecs/ecs.config
sudo systemctl restart ecs
✅ Fix: Ensure the Launch Template includes user data:
user_data = base64encode(<<EOF
#!/bin/bash
echo "ECS_CLUSTER=<your-cluster-name>" >> /etc/ecs/ecs.config
EOF
)
Update Auto Scaling Group with the latest launch template:
aws autoscaling update-auto-scaling-group --auto-scaling-group-name <asg-name> --launch-template Name=<lt-name>,Version=$Latest
5️⃣ Verify ECS Agent is Running on EC2 Instances
Run:
sudo systemctl status ecs
✅ Expected: active (running)
🔴 If not running, restart ECS agent:
sudo systemctl restart ecs
✅ Fix: If ECS agent is missing, install it:
sudo yum install -y ecs-init
sudo systemctl enable ecs
sudo systemctl start ecs
6️⃣ Verify IAM Role for ECS Instances
Run:
aws iam list-attached-role-policies --role-name <ecs-instance-role>
✅ Expected: AmazonEC2ContainerServiceforEC2Role
attached.
🔴 If missing, attach manually:
aws iam attach-role-policy --role-name <ecs-instance-role> --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role
7️⃣ Force ECS to Re-Register Instances
Run:
aws ecs update-container-instances-state --cluster <your-cluster-name> --container-instances <instance-id> --status ACTIVE
✅ Expected: Instances move to ACTIVE
state.
🔴 If instances are missing, force new deployment:
aws ecs update-service --cluster <your-cluster-name> --service <your-service-name> --force-new-deployment
Final Steps
- Check service status:
aws ecs describe-services --cluster <your-cluster-name> --services <your-service-name>
- Check registered EC2 instances:
aws ecs list-container-instances --cluster <your-cluster-name>
- Describe failing task logs:
aws logs describe-log-streams --log-group-name "/ecs/<your-service-name>"
Once instances are properly registered, ECS should start running tasks. 🚀
Top comments (0)