Fargate fails to pull ECR image with an error:
CannotPullContainerError: API error (500): Get https://xxxxxxx.dkr.ecr.us-east-1.amazonaws.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I have followed recommendations described here:
https://github.com/aws/amazon-ecs-agent/issues/1128#issuecomment-351545461
Configuration summary:
VPC: 10.0.0.0/16
Subnet: 10.0.1.0/24
NAT gateway with a public IP: nat-id
Subnet's routing entries:
10.0.0.0/16 | local
0.0.0.0/0 | nat-id
Security group outbound: ALL Traffic ALL ALL 0.0.0.0/0
Auto-assign public IP DISABLED
As others have pointed out here https://github.com/aws/amazon-ecs-agent/issues/1128#issuecomment-352090244 and here https://github.com/aws/amazon-ecs-agent/issues/1128#issuecomment-354883589, just setting up NAT as described here https://github.com/aws/amazon-ecs-agent/issues/1128#issuecomment-352090244 is not sufficient. There really seems to be something wrong with the access. The exact same repository image starts successfully when deployed with a public IP.
Please let me know if you need more details to reproduce.
I would attempt to debug this by creating an EC2 instance in the subnet and checking whether docker pull works. The EC2 instance should not have a public IP for testing purposes. Is your subnet private or public? I believe with Fargate you should have a private subnet and a public subnet, deploy the task to the private subnet, and then use NAT + IGW for public internet access.
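A lighter first check from an instance in that subnet, before installing Docker, is to see whether a TCP connection to the registry endpoint opens at all. The following is only a minimal sketch: `can_reach` is a hypothetical helper name, and the 5-second timeout is an arbitrary assumption. A `False` result is consistent with the timeout in the error above, but it does not by itself prove the route table is at fault.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it only tests TCP reachability,
    not ECR authentication or the image pull itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

From the test instance, you would call it with your own registry host, for example `can_reach("xxxxxxx.dkr.ecr.us-east-1.amazonaws.com")`, using the real account prefix in place of the redacted one.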
@panuhorsmalahti thanks for the tips. This single subnet is in a private range. Access to the service from outside is currently not a concern - I need to get it running first.
What does "in private range" mean? Is the subnet private or public? The recommended setup seems to be using a private subnet, a public subnet, a NAT Gateway and an Internet gateway. I got my setup working with that. I launched the task into the private subnet.
See AWS documentation:
"If a subnet's traffic is routed to an internet gateway, the subnet is known as a public subnet."
"If a subnet doesn't have a route to the internet gateway, the subnet is known as a private subnet."
https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html
Also see:
"Tasks launched within public subnets do not have outbound network access."
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-networking.html
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-public-private-vpc.html
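To make the quoted definition concrete, here is a small sketch that classifies a subnet from its route table. It assumes the route table is written as a plain mapping of destination CIDR to target ID, and that internet gateway IDs start with `igw-`; `classify_subnet` is a made-up name for illustration, not an AWS API.

```python
from typing import Dict


def classify_subnet(routes: Dict[str, str]) -> str:
    """Classify a subnet as 'public' or 'private' from its route table.

    `routes` maps destination CIDR -> target ID (assumed representation).
    A subnet counts as public if any route targets an internet gateway.
    """
    if any(target.startswith("igw-") for target in routes.values()):
        return "public"
    return "private"


# The route table from the original report: default route goes to the NAT.
reported = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-id"}
print(classify_subnet(reported))
```

By this definition the subnet in the original report is private, since its only non-local route points at the NAT gateway rather than an internet gateway.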
As described above, I am using NAT, not an IGW; the subnet is private. Do you use ECR in your setup?
Yes.
Anyway, I suggest the following: "I would attempt to debug this by creating an EC2 instance in the subnet and seeing if docker pull works"
@panuhorsmalahti thanks a lot for providing the relevant information. I did not realize before that one has to set up both private AND public subnets. Just assigning a NAT gateway to a private subnet is not enough. There must be another subnet in the same VPC that routes its 0.0.0.0/0 to the IGW.
I'm experiencing this issue as well. Can you clarify what you mean by needing both a private and a public subnet?
@dovidkopel there's not much to clarify there. https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html
@afedulov's answer is spot on! The way it works is like this:
Private Subnet routes 0.0.0.0/0 to NAT Gateway.
NAT Gateway is attached to a Public Subnet which routes 0.0.0.0/0 to Internet Gateway.
All on the same VPC.
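The two-hop rule above can be sketched as a small check over route tables. This is an illustrative model only: the subnet names, the `nat_subnet` mapping, and `nat_egress_works` are hypothetical, and the "broken" case assumes the NAT gateway in the original report was created in the same single subnet, which the thread implies but does not state.

```python
from typing import Dict

DEFAULT_ROUTE = "0.0.0.0/0"


def nat_egress_works(
    task_subnet: str,
    route_tables: Dict[str, Dict[str, str]],
    nat_subnet: Dict[str, str],
) -> bool:
    """Return True if the task subnet reaches the internet via NAT -> IGW.

    route_tables: subnet name -> {destination CIDR -> target ID}
    nat_subnet:   NAT gateway ID -> subnet the NAT gateway lives in
    """
    # Hop 1: the task subnet's default route must point at a NAT gateway.
    nat_id = route_tables.get(task_subnet, {}).get(DEFAULT_ROUTE, "")
    if not nat_id.startswith("nat-"):
        return False
    # Hop 2: the subnet hosting the NAT must route its default to an IGW.
    host = nat_subnet.get(nat_id)
    if host is None:
        return False
    upstream = route_tables.get(host, {}).get(DEFAULT_ROUTE, "")
    return upstream.startswith("igw-")


# Assumed original setup: one subnet, NAT placed inside it, no IGW route.
broken = {"private": {"10.0.0.0/16": "local", DEFAULT_ROUTE: "nat-id"}}
# Working setup: NAT lives in a second subnet whose default route is the IGW.
fixed = dict(broken, public={"10.0.0.0/16": "local", DEFAULT_ROUTE: "igw-id"})

print(nat_egress_works("private", broken, {"nat-id": "private"}))
print(nat_egress_works("private", fixed, {"nat-id": "public"}))
```

In the assumed single-subnet layout the NAT gateway's own traffic is sent back to the NAT, so nothing ever reaches an internet gateway; moving the NAT into a subnet with an IGW route completes the path.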