eksctl: Failed to create private cluster in AWS China regions

Created on 16 Jan 2021 · 9 comments · Source: weaveworks/eksctl

What were you trying to accomplish?
Create a private cluster in AWS China regions (Beijing and Ningxia).

What happened?
It failed with the error message below:

[✖]  error adding resources for VPC endpoints: error building endpoint service details: error describing VPC endpoint services: InvalidServiceName: The Vpc Endpoint Service 'cn.com.amazonaws.cn-northwest-1.s3' does not exist
        status code: 400, request id: 6fdefd95-7a31-4adf-bd4b-2a3a3195794a

How to reproduce it?

Create a cluster with the command and config file below:

eksctl create cluster -f ./eks-private-cluster.yaml --region cn-northwest-1

eks-private-cluster.yaml:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: private-test
  region: cn-northwest-1

privateCluster:
  enabled: true
  additionalEndpointServices:
  - "autoscaling"

managedNodeGroups:
- name: ng1
  instanceType: c5.xlarge
  desiredCapacity: 1
  privateNetworking: true

Logs

[ℹ]  eksctl version 0.32.0
[ℹ]  using region cn-northwest-1
[ℹ]  setting availability zones to [cn-northwest-1c cn-northwest-1b cn-northwest-1a]
[ℹ]  subnets for cn-northwest-1c - public:192.168.0.0/19 private:192.168.96.0/19
[ℹ]  subnets for cn-northwest-1b - public:192.168.32.0/19 private:192.168.128.0/19
[ℹ]  subnets for cn-northwest-1a - public:192.168.64.0/19 private:192.168.160.0/19
[ℹ]  using Kubernetes version 1.18
[ℹ]  creating EKS cluster "private-test" in "cn-northwest-1" region with managed nodes
[ℹ]  1 nodegroup (ng1) was included (based on the include/exclude rules)
[ℹ]  will create a CloudFormation stack for cluster itself and 0 nodegroup stack(s)
[ℹ]  will create a CloudFormation stack for cluster itself and 1 managed nodegroup stack(s)
[ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=cn-northwest-1 --cluster=private-test'
[ℹ]  CloudWatch logging will not be enabled for cluster "private-test" in "cn-northwest-1"
[ℹ]  you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=cn-northwest-1 --cluster=private-test'
[ℹ]  Kubernetes API endpoint access will use provided values {publicAccess=true, privateAccess=true} for cluster "private-test" in "cn-northwest-1"
[ℹ]  2 sequential tasks: { create cluster control plane "private-test", 2 sequential sub-tasks: { update cluster VPC endpoint access configuration, create managed nodegroup "ng1" } }
[ℹ]  building cluster stack "eksctl-private-test-cluster"
[!]  1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
[ℹ]  to cleanup resources, run 'eksctl delete cluster --region=cn-northwest-1 --name=private-test'
[✖]  error adding resources for VPC endpoints: error building endpoint service details: error describing VPC endpoint services: InvalidServiceName: The Vpc Endpoint Service 'cn.com.amazonaws.cn-northwest-1.s3' does not exist
        status code: 400, request id: 6fdefd95-7a31-4adf-bd4b-2a3a3195794a

Anything else we need to know?

The S3 VPC endpoint service name should be "com.amazonaws.cn-northwest-1.s3", without the "cn." prefix.
The relevant code is here.
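For anyone without access to the China partition, the mismatch can be confirmed against the EC2 API directly. Below is a minimal sketch using the AWS Go SDK (v1); it is not eksctl code, and the hard-coded region and service names simply mirror this report:

package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	// Assumes credentials for the China partition are configured.
	sess := session.Must(session.NewSession(&aws.Config{
		Region: aws.String("cn-northwest-1"),
	}))
	svc := ec2.New(sess)

	for _, name := range []string{
		"cn.com.amazonaws.cn-northwest-1.s3", // what eksctl 0.32.0 builds
		"com.amazonaws.cn-northwest-1.s3",    // the actual service name
	} {
		_, err := svc.DescribeVpcEndpointServices(&ec2.DescribeVpcEndpointServicesInput{
			ServiceNames: aws.StringSlice([]string{name}),
		})
		// The first name returns InvalidServiceName (the error in the logs
		// above); the second succeeds.
		fmt.Printf("%s -> err: %v\n", name, err)
	}
}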

Versions

$ eksctl version
0.32.0
$ kubectl version
Labels: kind/bug, priority/critical

All 9 comments

Thanks for reporting this @walkley and for identifying the problem in the code.

If you want to submit a fix, we would be happy to accept it; otherwise the team will get to this soon 👍

Hi @walkley, unfortunately we don't have access to the China partition. Are you able to clarify why the S3 endpoint leaves out the prefix while the ECR endpoint needs it? (https://github.com/weaveworks/eksctl/issues/2568)
It's not clear to me from the AWS documentation. It makes me concerned that the same problem might be hiding elsewhere.

I couldn't find any documentation for the individual VPC endpoint service names either, so I extracted them from the AWS China VPC endpoint console, which gives the full list for the Beijing and Ningxia regions.

For the 8 VPC endpoint services used by eksctl, here are the lists for the Beijing (cn-north-1) and Ningxia (cn-northwest-1) regions:

cn.com.amazonaws.cn-north-1.ec2
cn.com.amazonaws.cn-north-1.ecr.api
cn.com.amazonaws.cn-north-1.ecr.dkr
com.amazonaws.cn-north-1.s3
cn.com.amazonaws.cn-north-1.sts
cn.com.amazonaws.cn-north-1.cloudformation
cn.com.amazonaws.cn-north-1.autoscaling
com.amazonaws.cn-north-1.logs

cn.com.amazonaws.cn-northwest-1.ec2
cn.com.amazonaws.cn-northwest-1.ecr.api
cn.com.amazonaws.cn-northwest-1.ecr.dkr
com.amazonaws.cn-northwest-1.s3
cn.com.amazonaws.cn-northwest-1.sts
cn.com.amazonaws.cn-northwest-1.cloudformation
cn.com.amazonaws.cn-northwest-1.autoscaling
com.amazonaws.cn-northwest-1.logs

There's no 'cn.' prefix for s3 and logs; these may need to be hard-coded, as I don't see any pattern in the service names (see the sketch below).
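For illustration, the hard-coding idea could look like the following minimal sketch. This is not the actual eksctl fix; the helper and map names are made up, and the exception list is taken from the console output above:

package main

import "fmt"

// servicesWithStandardPrefix: China-partition endpoint services that keep
// the usual "com.amazonaws.<region>." form (per the console listing above).
var servicesWithStandardPrefix = map[string]bool{
	"s3":   true,
	"logs": true,
}

// chinaServiceName builds the endpoint service name for a China region,
// adding the "cn." prefix except for the hard-coded exceptions.
func chinaServiceName(region, service string) string {
	if servicesWithStandardPrefix[service] {
		return fmt.Sprintf("com.amazonaws.%s.%s", region, service)
	}
	return fmt.Sprintf("cn.com.amazonaws.%s.%s", region, service)
}

func main() {
	// Reproduces the Ningxia list above.
	for _, svc := range []string{"ec2", "ecr.api", "ecr.dkr", "s3", "sts", "cloudformation", "autoscaling", "logs"} {
		fmt.Println(chinaServiceName("cn-northwest-1", svc))
	}
}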

I would be happy to validate the fix if you don't have access to AWS China regions.

@walkley 🙏 Thanks so much! The fix is now merged to master. If you can validate with master, that'd be awesome; otherwise we'll make a release on Friday.

I downloaded the artifact for that commit from CircleCI and validated the private VPC in the Ningxia region; it works well!
Thank you so much!

eksctl create cluster -f ./private-vpc-test.yaml

private-vpc-test.yaml:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: private-test
  region: cn-northwest-1

privateCluster:
  enabled: true
  additionalEndpointServices:
  - "autoscaling"
  - "cloudformation"
  - "logs"

managedNodeGroups:
- name: ng1
  instanceType: c5.xlarge
  desiredCapacity: 1
  privateNetworking: true

output of eksctl:

[ℹ]  eksctl version 0.37.0-dev+c3f38940.2021-01-20T13:58:46Z
[ℹ]  using region cn-northwest-1
[ℹ]  setting availability zones to [cn-northwest-1b cn-northwest-1c cn-northwest-1a]
[ℹ]  subnets for cn-northwest-1b - public:192.168.0.0/19 private:192.168.96.0/19
[ℹ]  subnets for cn-northwest-1c - public:192.168.32.0/19 private:192.168.128.0/19
[ℹ]  subnets for cn-northwest-1a - public:192.168.64.0/19 private:192.168.160.0/19
[ℹ]  using Kubernetes version 1.18
[ℹ]  creating EKS cluster "private-test" in "cn-northwest-1" region with managed nodes
[ℹ]  1 nodegroup (ng1) was included (based on the include/exclude rules)
[ℹ]  will create a CloudFormation stack for cluster itself and 0 nodegroup stack(s)
[ℹ]  will create a CloudFormation stack for cluster itself and 1 managed nodegroup stack(s)
[ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=cn-northwest-1 --cluster=private-test'
[ℹ]  CloudWatch logging will not be enabled for cluster "private-test" in "cn-northwest-1"
[ℹ]  you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=cn-northwest-1 --cluster=private-test'
[ℹ]  Kubernetes API endpoint access will use provided values {publicAccess=true, privateAccess=true} for cluster "private-test" in "cn-northwest-1"
[ℹ]  2 sequential tasks: { create cluster control plane "private-test", 3 sequential sub-tasks: { update cluster VPC endpoint access configuration, create addons, create managed nodegroup "ng1" } }
[ℹ]  building cluster stack "eksctl-private-test-cluster"
[ℹ]  deploying stack "eksctl-private-test-cluster"
[ℹ]  waiting for CloudFormation stack "eksctl-private-test-cluster"
[ℹ]  waiting for CloudFormation stack "eksctl-private-test-cluster"
...
[ℹ]  waiting for requested "EndpointAccessUpdate" in cluster "private-test" to succeed
[ℹ]  waiting for requested "EndpointAccessUpdate" in cluster "private-test" to succeed
...
[ℹ]  building managed nodegroup stack "eksctl-private-test-nodegroup-ng1"
[ℹ]  deploying stack "eksctl-private-test-nodegroup-ng1"
[ℹ]  waiting for CloudFormation stack "eksctl-private-test-nodegroup-ng1"
[ℹ]  waiting for CloudFormation stack "eksctl-private-test-nodegroup-ng1"
...
[ℹ]  waiting for the control plane availability...
[✔]  saved kubeconfig as "/home/ec2-user/.kube/config"
[ℹ]  no tasks
[✔]  all EKS cluster resources for "private-test" have been created
[ℹ]  nodegroup "ng1" has 1 node(s)
[ℹ]  node "ip-192-168-190-216.cn-northwest-1.compute.internal" is ready
[ℹ]  waiting for at least 1 node(s) to become ready in "ng1"
[ℹ]  nodegroup "ng1" has 1 node(s)
[ℹ]  node "ip-192-168-190-216.cn-northwest-1.compute.internal" is ready
[ℹ]  kubectl command should work with "/home/ec2-user/.kube/config", try 'kubectl get nodes'
[ℹ]  disabling public endpoint access for the cluster
[ℹ]  waiting for requested "EndpointAccessUpdate" in cluster "private-test" to succeed
[ℹ]  waiting for requested "EndpointAccessUpdate" in cluster "private-test" to succeed
...
[ℹ]  fully private cluster "private-test" has been created. For subsequent operations, eksctl must be run from within the cluster's VPC, a peered VPC or some other means like AWS Direct Connect
[✔]  EKS cluster "private-test" in "cn-northwest-1" region is ready

Hello,

there is also a bug similar to the one described above in the Europe (Milan) region, eu-south-1.

This is the YAML file used:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: cloudbees-ci
  region: eu-south-1

vpc:
  id: "vpc-010798c958f4bfdff"
  subnets:
    private:
      eu-south-1a:
          id: "subnet-04ecf032c30889df0"
      eu-south-1b:
          id: "subnet-0da7f1efbf13add01"

privateCluster:
  enabled: true
  additionalEndpointServices:
  - "autoscaling"
  - "cloudformation"
  - "logs"

nodeGroups:
  - name: ng-1
    instanceType: m5.large
    desiredCapacity: 1
    privateNetworking: true


fargateProfiles:
  - name: fp-default
    selectors:
      - namespace: default
      - namespace: kube-system
  - name: fp-devops
    selectors:
      - namespace: devops

cloudWatch:
  clusterLogging:
    enableTypes: ["*"]

These are the logs:

[ℹ]  eksctl version 0.36.1
[ℹ]  using region eu-south-1
[✔]  using existing VPC (vpc-010798c958f4bfdff) and subnets (private:[subnet-0da7f1efbf13add01 subnet-04ecf032c30889df0] public:[])
[!]  custom VPC/subnets will be used; if resulting cluster doesn't function as expected, make sure to review the configuration of VPC/subnets
[ℹ]  nodegroup "ng-1" will use "ami-066461a96ead1ce53" [AmazonLinux2/1.18]
[ℹ]  using Kubernetes version 1.18
[ℹ]  creating EKS cluster "cloudbees-ci" in "eu-south-1" region with Fargate profile and un-managed nodes
[ℹ]  1 nodegroup (ng-1) was included (based on the include/exclude rules)
[ℹ]  will create a CloudFormation stack for cluster itself and 1 nodegroup stack(s)
[ℹ]  will create a CloudFormation stack for cluster itself and 0 managed nodegroup stack(s)
[ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=eu-south-1 --cluster=cloudbees-ci'
[ℹ]  Kubernetes API endpoint access will use provided values {publicAccess=true, privateAccess=true} for cluster "cloudbees-ci" in "eu-south-1"
[ℹ]  2 sequential tasks: { create cluster control plane "cloudbees-ci", 3 sequential sub-tasks: { 3 sequential sub-tasks: { update CloudWatch logging configuration, update cluster VPC endpoint access configuration, create fargate profiles }, create addons, create nodegroup "ng-1" } }
[ℹ]  building cluster stack "eksctl-cloudbees-ci-cluster"
[!]  1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
[ℹ]  to cleanup resources, run 'eksctl delete cluster --region=eu-south-1 --name=cloudbees-ci'
[✖]  error adding resources for VPC endpoints: error building endpoint service details: error describing VPC endpoint services: InvalidServiceName: The Vpc Endpoint Service 'com.amazonaws.eu-south-1.ecr.dkr' does not exist
        status code: 400, request id: cbdf3f57-4b30-41c8-83b4-7009a71042e3
Error: failed to create cluster "cloudbees-ci"
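Until a release with a fix is out, a rough pre-flight check against the EC2 API can show which of the expected service names actually exist in a region. This sketch is not part of eksctl; the service list simply mirrors the ones discussed in this thread:

package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	region := "eu-south-1"
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String(region)}))

	// List every VPC endpoint service name the region advertises.
	out, err := ec2.New(sess).DescribeVpcEndpointServices(&ec2.DescribeVpcEndpointServicesInput{})
	if err != nil {
		log.Fatal(err)
	}
	available := map[string]bool{}
	for _, n := range out.ServiceNames {
		available[*n] = true
	}

	// Check the names eksctl would request (defaults plus the
	// additionalEndpointServices from the config above).
	for _, svc := range []string{"ec2", "ecr.api", "ecr.dkr", "s3", "sts", "autoscaling", "cloudformation", "logs"} {
		name := fmt.Sprintf("com.amazonaws.%s.%s", region, svc)
		fmt.Printf("%-50s exists=%v\n", name, available[name])
	}
}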

Thanks for reporting this @AndreaGal95, I have created a new issue to track the fix for this region 👍

When do you think the bug will be fixed? Currently, to work around the problem, I created the cluster as public, without the 'privateCluster' and 'nodeGroups' sections, and added those settings later, directly in the AWS console.

Thanks in advance.
