Reported by @AndreaGal95 in https://github.com/weaveworks/eksctl/issues/3081
What were you trying to accomplish?
Trying to create a cluster in eu-south-1 region
What happened?
Failed with:
services: InvalidServiceName: The Vpc Endpoint Service 'com.amazonaws.eu-south-1.ecr.dkr' does not exist
How to reproduce it?
Create a cluster in region eu-south-1
Config:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: cloudbees-ci
  region: eu-south-1
vpc:
  id: "vpc-010798c958f4bfdff"
  subnets:
    private:
      eu-south-1a:
        id: "subnet-04ecf032c30889df0"
      eu-south-1b:
        id: "subnet-0da7f1efbf13add01"
privateCluster:
  enabled: true
  additionalEndpointServices:
  - "autoscaling"
  - "cloudformation"
  - "logs"
nodeGroups:
- name: ng-1
  instanceType: m5.large
  desiredCapacity: 1
  privateNetworking: true
fargateProfiles:
- name: fp-default
  selectors:
  - namespace: default
  - namespace: kube-system
- name: fp-devops
  selectors:
  - namespace: devops
cloudWatch:
  clusterLogging:
    enableTypes: ["*"]
Logs
[ℹ] eksctl version 0.36.1
[ℹ] using region eu-south-1
[✔] using existing VPC (vpc-010798c958f4bfdff) and subnets (private:[subnet-0da7f1efbf13add01 subnet-04ecf032c30889df0] public:[])
[!] custom VPC/subnets will be used; if resulting cluster doesn't function as expected, make sure to review the configuration of VPC/subnets
[ℹ] nodegroup "ng-1" will use "ami-066461a96ead1ce53" [AmazonLinux2/1.18]
[ℹ] using Kubernetes version 1.18
[ℹ] creating EKS cluster "cloudbees-ci" in "eu-south-1" region with Fargate profile and un-managed nodes
[ℹ] 1 nodegroup (ng-1) was included (based on the include/exclude rules)
[ℹ] will create a CloudFormation stack for cluster itself and 1 nodegroup stack(s)
[ℹ] will create a CloudFormation stack for cluster itself and 0 managed nodegroup stack(s)
[ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=eu-south-1 --cluster=cloudbees-ci'
[ℹ] Kubernetes API endpoint access will use provided values {publicAccess=true, privateAccess=true} for cluster "cloudbees-ci" in "eu-south-1"
[ℹ] 2 sequential tasks: { create cluster control plane "cloudbees-ci", 3 sequential sub-tasks: { 3 sequential sub-tasks: { update CloudWatch logging configuration, update cluster VPC endpoint access configuration, create fargate profiles }, create addons, create nodegroup "ng-1" } }
[ℹ] building cluster stack "eksctl-cloudbees-ci-cluster"
[!] 1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
[ℹ] to cleanup resources, run 'eksctl delete cluster --region=eu-south-1 --name=cloudbees-ci'
[✖] error adding resources for VPC endpoints: error building endpoint service details: error describing VPC endpoint services: InvalidServiceName: The Vpc Endpoint Service 'com.amazonaws.eu-south-1.ecr.dkr' does not exist
status code: 400, request id: cbdf3f57-4b30-41c8-83b4-7009a71042e3
Error: failed to create cluster "cloudbees-ci"
Versions
0.36.1
(pasting in from other issue to keep context in this thread)
When do you think the bug will be fixed? Currently, to work around the problem, I created the cluster as 'Public', without using 'privateCluster' and 'nodeGroups', and added those settings later, directly in the AWS console.
Thanks in advance.
@AndreaGal95 someone should get to it today
The ecr.api endpoint does not show up in a DescribeVpcEndpointServices call for eu-south-1, even though it's mentioned in this list under Service Endpoints (https://docs.aws.amazon.com/general/latest/gr/ecr.html):
Europe (Milan) | eu-south-1 | ecr.eu-south-1.amazonaws.com, api.ecr.eu-south-1.amazonaws.com
(Although the ecr.dkr endpoint is missing in that list for eu-south-1).
These endpoints are required for a fully-private cluster to work because ECR hosts the manifest for the container images for the CNI plugin and other AWS addons. It might be the case that these endpoints are not supported in that region, or need to be explicitly enabled somehow.
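For anyone who wants to check a region before creating a cluster, here is a minimal Python sketch of the comparison described above. The helper names are hypothetical, and the list of required services is only what this thread mentions (ecr.api, ecr.dkr, plus the additionalEndpointServices from the config), not necessarily the full set eksctl creates. The lookup function assumes boto3 and AWS credentials are available, and it reads only the first page of results.

```python
def endpoint_service_names(region, services):
    """Build full service names such as 'com.amazonaws.eu-south-1.ecr.dkr'."""
    return [f"com.amazonaws.{region}.{service}" for service in services]


def missing_endpoint_services(region, services, available):
    """Return the required service names that are absent from `available`."""
    available = set(available)
    return [
        name
        for name in endpoint_service_names(region, services)
        if name not in available
    ]


def list_available_services(region):
    """Fetch service names via EC2 DescribeVpcEndpointServices.

    Assumption: boto3 is installed and credentials are configured.
    Pagination is ignored for brevity, so only the first page is read.
    """
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.describe_vpc_endpoint_services()["ServiceNames"]
```

Calling `missing_endpoint_services("eu-south-1", ["ecr.api", "ecr.dkr", "logs"], list_available_services("eu-south-1"))` would then list whichever of those services the region does not offer as VPC endpoints.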
Note that I made the cluster entirely private directly in the AWS Console, and everything works fine.
@AndreaGal95, I believe you merely changed the API Endpoint Access setting to Private. That doesn't make the cluster fully private; it only restricts access to the API server endpoint to within the VPC. Your nodes still have access to the internet.
The fully-private cluster feature in eksctl also launches your nodegroups in fully-private subnets that have no route to an internet gateway (either directly or via a NAT gateway). In order to support this, it uses VPC endpoints.
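To make the distinction concrete, here is a rough Python sketch of how one might test whether a subnet's route table still has a path to the internet. The function name is hypothetical, and the input is assumed to be the `Routes` list of a route table as returned by EC2 DescribeRouteTables. It ignores IPv6 egress-only internet gateways, so treat it as an illustration and not as what eksctl does internally.

```python
def has_internet_route(routes):
    """Return True if any route targets an internet gateway or a NAT gateway.

    `routes` is assumed to be a list of dicts shaped like the `Routes`
    entries of an EC2 DescribeRouteTables response.
    """
    for route in routes:
        # Internet gateway targets have IDs starting with "igw-";
        # the VPC-local route uses the literal GatewayId "local".
        if (route.get("GatewayId") or "").startswith("igw-"):
            return True
        # NAT gateway targets are reported under a separate key.
        if route.get("NatGatewayId"):
            return True
    return False
```

A subnet whose route table returns False here has no internet route via those gateways, which is why it needs VPC endpoints to reach services like ECR.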
Looking at https://docs.aws.amazon.com/general/latest/gr/ecr.html#ecr_region it appears there is _no_ ECR DKR service at all for eu-south-1.
@AndreaGal95 Update: VPC endpoints for ECR are not supported in eu-south-1 yet, so until then eksctl can't support fully-private clusters in that region. We can, however, improve the error message to reflect that.
@cPu1 can we have a warning in the docs for that as well plz?