I tried many times, and even when the image is in the ECR registry, I get the following error:
CannotPullContainerError: API error (500): Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
or
CannotPullContainerError: API error (500): Get https://XXX.dkr.ecr.us-east-1.amazonaws.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I am using images in the ECR registry with Fargate.
Expected: provisioning finishes and the container status becomes "RUNNING".
Observed: the task stays in PENDING status (for at least 5 minutes) until it throws the error.
No logs are available; Fargate does not provide any logs while provisioning Docker containers.
Same problem here. I also tried to pull from our private registry, but there is no option to pass the credentials to Fargate.
I am sorry to hear you are having problems.
The error you are seeing below is commonly due to lack of internet access to pull the image. The image pull occurs over the network interface used by the Task, and as such shares security group and routing rules.
Please check your configuration for the following:
If neither of those networking changes apply to you or if they do not fix your problem, please let us know so we can further assist.
For anyone else who drops by here:
I wrestled with this for a while until I figured out that, in addition to what @samuelkarp said above, I needed to add AssignPublicIp: ENABLED to my network configuration. After adding this, I stopped getting the "Client.Timeout exceeded while awaiting headers" error.
NetworkConfiguration:
  AwsvpcConfiguration:
    AssignPublicIp: 'ENABLED'
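For anyone launching tasks from code rather than a template, the same setting can be expressed in the `networkConfiguration` shape that the ECS RunTask and CreateService APIs accept (for example through boto3). This is only a sketch that builds the dictionary. The helper name and the subnet and security group IDs are placeholders I made up, not values from this thread.

```python
from typing import Dict, List


def build_network_configuration(
    subnets: List[str],
    security_groups: List[str],
    assign_public_ip: bool,
) -> Dict[str, Dict[str, object]]:
    """Build an awsvpc network configuration dict (hypothetical helper)."""
    return {
        "awsvpcConfiguration": {
            "subnets": list(subnets),
            "securityGroups": list(security_groups),
            # The API expects the strings "ENABLED" or "DISABLED".
            "assignPublicIp": "ENABLED" if assign_public_ip else "DISABLED",
        }
    }


# Placeholder IDs; substitute your own subnet and security group.
network_configuration = build_network_configuration(
    subnets=["subnet-0123456789abcdef0"],
    security_groups=["sg-0123456789abcdef0"],
    assign_public_ip=True,
)
```

The result would be passed as `networkConfiguration=network_configuration` when starting the task with the Fargate launch type.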
@tklovett thanks!
@samuelkarp How are we supposed to prevent access to the public IP then?
@byF Security groups provide customizable rules to control inbound and outbound traffic.
I followed @samuelkarp's instructions but that didn't help until I started a service with a public IP as @tklovett suggested. I don't understand why this should be the case: my service should not be open to the internet, yet deploying any image requires internet access, which is only available if you make the service public? This seems like very poor security practice...
Edit: just saw the last two comments. Perhaps I don't understand this, but from a usability perspective I would like for the service to not have a public interface at all because it will never need it. But it looks like for this purpose it must have one. It is a mismatch from how things are done in EC2, where instances can be made private and no one has to worry about anything (like say, someone editing the security group, note that you can't add a public interface to an EC2 instance after it has been started).
@hadsed In order to pull the image, your ENI must have access to the registry. For Docker Hub and for Amazon ECR, this means your ENI must have access to reach the Internet. You can achieve access to the Internet in a few different ways, but the most common are an Internet Gateway and public IP address or using NAT and a private IP address. For NAT, you can use NAT instances or a NAT gateway.
If you want to disable Internet access entirely, you'll need to use a registry located inside your VPC instead of a registry that requires Internet access.
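To make the two working combinations concrete, here is a small sketch that encodes the rule as a predicate: public IP plus a route to an Internet Gateway, or a route through NAT. The `TaskNetwork` class and its fields are a simplified model invented for illustration, not an AWS API, and it ignores other factors such as network ACLs and DNS.

```python
from dataclasses import dataclass


@dataclass
class TaskNetwork:
    """Simplified, hypothetical view of a task's network setup."""

    has_public_ip: bool
    default_route: str  # "igw", "nat", or "none" for the subnet's 0.0.0.0/0 route
    egress_allowed: bool = True  # security group permits outbound traffic


def can_reach_registry(net: TaskNetwork) -> bool:
    """Return True if the task's ENI should be able to reach an Internet registry."""
    if not net.egress_allowed:
        # Image pulls share the task's security group rules.
        return False
    if net.default_route == "igw":
        # An Internet Gateway only helps when the ENI has a public IP.
        return net.has_public_ip
    # With NAT, a private IP is enough.
    return net.default_route == "nat"
```

With this model, a task without a public IP in a subnet whose default route points at an Internet Gateway comes out as unable to pull, while the same task behind NAT comes out as able to.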
It's just very limiting that I cannot restrict access to my services that may not be hardened against all types of internet traffic (that's what frontends like nginx, AWS ELB, etc. are for). So you can see how this is a problem: I either have to run my own registry (AWS ECR being useless for this case now) or I have to harden every service I'll ever deploy because it'll be open to the internet.
@hadsed Security groups provide customizable rules to control inbound and outbound traffic. You can also choose to use NAT instead of adding a public IP address which will also let you restrict inbound traffic.
@hadsed yeah, I was like "wtf" at first as well, then I found out, thanks to Sam's point, that you can limit the inbound traffic source to the security group itself, which means that even though there is a public IP, outside traffic will be blocked.
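A rough sketch of what such a self-referencing inbound rule looks like, in the `IpPermissions` shape that the EC2 API uses for security group rules. The helper name and the group ID are placeholders, not something from this thread.

```python
from typing import Dict, List


def self_referencing_ingress(security_group_id: str) -> List[Dict[str, object]]:
    """Build an inbound rule whose only allowed source is the group itself."""
    return [
        {
            "IpProtocol": "-1",  # all protocols and ports
            # The source is a security group rather than a CIDR range,
            # so traffic from arbitrary Internet addresses does not match.
            "UserIdGroupPairs": [{"GroupId": security_group_id}],
        }
    ]


# Placeholder group ID; substitute your own.
ingress_rules = self_referencing_ingress("sg-0123456789abcdef0")
```

With boto3 this list would be passed as `IpPermissions` to the EC2 client's `authorize_security_group_ingress`, together with the same `GroupId`.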
"You can also choose to use NAT instead of adding a public IP address which will also let you restrict inbound traffic."
How can this be configured? I'm launching the task into a private subnet with a route table to a NAT gateway in a public subnet. The VPC has an internet gateway. I can verify that EC2 instances in both the private and public subnet can pull the docker image (as they should be able to), but I'm still getting CannotPullContainerError. What am I missing?
EDIT: My problem was that the ECS service task's security group's outbound rule didn't allow pulling the image. I didn't notice that in Terraform a security group doesn't allow outbound traffic by default:
https://www.terraform.io/docs/providers/aws/r/security_group.html
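If you want to check for this situation, here is a minimal sketch that inspects a list of egress rules in the shape returned under `IpPermissionsEgress` by the EC2 DescribeSecurityGroups API. It is deliberately simplified: it only recognizes rules open to 0.0.0.0/0 and ignores narrower CIDR ranges, prefix lists, and IPv6. The function name is my own.

```python
from typing import Any, Dict, List

HTTPS_PORT = 443


def allows_https_egress(egress_rules: List[Dict[str, Any]]) -> bool:
    """Return True if some rule permits outbound HTTPS to 0.0.0.0/0."""
    for rule in egress_rules:
        open_to_all = any(
            ip_range.get("CidrIp") == "0.0.0.0/0"
            for ip_range in rule.get("IpRanges", [])
        )
        if not open_to_all:
            continue
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports.
            return True
        from_port = rule.get("FromPort")
        to_port = rule.get("ToPort")
        if (
            protocol == "tcp"
            and from_port is not None
            and to_port is not None
            and from_port <= HTTPS_PORT <= to_port
        ):
            return True
    return False
```

An empty rule list, which is what I effectively ended up with from Terraform, makes this return False.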
@samuelkarp, as mentioned by @tklovett and @hadsed, without assigning a public IP, Fargate does not get access to ECR. I have 2 services configured in exactly the same way: same VPC, same subnets, same security groups. 0.0.0.0/0 is routed to an IGW. ACL rules allow all outbound traffic. The only difference: the first service has Auto-assign public IP ENABLED, the second DISABLED. The first one successfully starts the task, the second one fails with CannotPullContainerError. Could you please consider reopening this ticket?
@afedulov You either need private IP + NAT or a public IP + IGW. In your example, the task that's failing has neither NAT nor a public IP. My earlier comment has the full information.
I'm going to lock this issue; please see this comment and this comment before opening a new issue.
AWS PrivateLink support was released in Feb 2019: https://www.infoq.com/news/2019/02/aws-privatelink-ecr-ecs/
You can refer to the VPC endpoint guides here:
ECS: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/vpc-endpoints.html
ECR: https://docs.aws.amazon.com/AmazonECR/latest/userguide/vpc-endpoints.html
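As a rough guide to what those pages describe, pulling from ECR without Internet access generally involves two ECR interface endpoints plus an S3 gateway endpoint, since image layers are served from S3. The sketch below just builds the service names for a region. The helper is hypothetical, the names follow the standard `com.amazonaws.<region>.<service>` pattern and may differ in some partitions, and the exact set required depends on the Fargate platform version, so please check the guides above.

```python
from typing import Dict, List


def ecr_pull_endpoint_services(region: str) -> Dict[str, List[str]]:
    """List VPC endpoint service names typically needed to pull from ECR."""
    prefix = f"com.amazonaws.{region}"
    return {
        # Interface endpoints (PrivateLink) for the ECR API and Docker registry.
        "interface": [f"{prefix}.ecr.api", f"{prefix}.ecr.dkr"],
        # Gateway endpoint for S3, where the image layers are stored.
        "gateway": [f"{prefix}.s3"],
    }


# Region taken from the error message at the top of this issue.
services = ecr_pull_endpoint_services("us-east-1")
```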