Kops: Can't create cluster using existing subnet

Created on 20 Dec 2016 · 48 Comments · Source: kubernetes/kops

I'm following our docs for setting up a cluster using an existing VPC and subnet, but can't get it to work.

Instances are spun up, but no Route53 records are created, and I can't SSH into the master. If I just create a cluster with an existing VPC but let kops create the subnet, then it all works fine.

I'm using kops master.

P0 area/documentation

Most helpful comment

wow ... not going to read this novel ... can someone TLDR; on where we are at, and can we close.

All 48 comments

I think this is a duplicate.... No time to look right now. Can you check the umbrella issue?

@chrislovecnm I guess you're thinking of https://github.com/kubernetes/kops/issues/1147. Sounds a bit different since that issue is only for private topologies, while I can't even get it to work with public topologies. But they may be related.

Yah something may be broken with the way that we match ids. Can you dump a log? I am adding this to the umbrella issue. Thanks for reporting btw. Have you used this functionality in the past?

Trying to determine when this was broken.

@justinsb can you take a look at this?

Yes I've used this successfully in the past.

Steps I followed:

  1. Create a VPC
  2. Enable DNS Hostnames as per the requirement
  3. Create a subnet in that VPC
  4. Create InternetGateway and attach to VPC
  5. Create cluster:
kops create cluster   \
   --kubernetes-version=1.4.0 \
   --cloud=aws   \
   --name=foo.bar.com \
   --vpc=<id> \
   --network-cidr=<CIDR> \
   --zones=us-east-1a \
   --state=<s3>
  6. Edit the cluster, changing the subnet CIDR and ID.
  7. Update the cluster:
kops update cluster --name=foo.bar.com --state=<state> --yes
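
For reference, steps 1-4 can be scripted roughly like this with the AWS CLI (a sketch only; the CIDRs and shell variables are placeholders, not from the original report):

# 1. Create a VPC (CIDR is a placeholder)
VPC_ID=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 --query 'Vpc.VpcId' --output text)

# 2. Enable DNS hostnames, which kops requires
aws ec2 modify-vpc-attribute --vpc-id "$VPC_ID" --enable-dns-hostnames '{"Value":true}'

# 3. Create a subnet in that VPC, in the zone passed to kops
SUBNET_ID=$(aws ec2 create-subnet --vpc-id "$VPC_ID" --cidr-block 10.0.0.0/24 \
  --availability-zone us-east-1a --query 'Subnet.SubnetId' --output text)

# 4. Create an internet gateway and attach it to the VPC
IGW_ID=$(aws ec2 create-internet-gateway --query 'InternetGateway.InternetGatewayId' --output text)
aws ec2 attach-internet-gateway --internet-gateway-id "$IGW_ID" --vpc-id "$VPC_ID"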

There are no error messages, but the cluster is not properly brought up.

@justinsb any update on above issue?

+1 here. I'm trying to create the cluster in a specific VPC/subnets.

I followed the steps above; while it creates the nodes/masters in the proper locations, I believe it's failing to create the ELBs and Route53 entries for the cluster.

Just tried this and _was_ able to get it to work. Though I am using 1.5.0 alpha4, or the branch that will become it.

Also, some improvements coming: #1540

One thing to check is that the subnet zone matches the zone you specify. Could this have been the problem, @yissachar?
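
A quick way to check that (a sketch; the subnet ID is a placeholder):

aws ec2 describe-subnets --subnet-ids subnet-xxxxxxxx \
  --query 'Subnets[].{id:SubnetId,az:AvailabilityZone,cidr:CidrBlock}' --output table

The availability zone shown here has to match the zone used in the kops subnet spec.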

I'm using kops 1.4.4 and had the same issue. Below is what I found.

Existing resources:

VPC id: vpc-xxxxxxxx
Route table id: rtb-xxxxxxx1
Subnet ids: subnet-xxxxxxxa, subnet-xxxxxxxb, subnet-xxxxxxxc

Steps:

kops create cluster \
--cloud=aws \
--zones=us-west-2a,us-west-2b,us-west-2c \
--node-size=t2.micro \
--master-size=t2.micro \
--node-count=1 \
--dns-zone=foo.bar \
--name=uswest2.foo.bar \
--vpc=vpc-xxxxxxxx \
--network-cidr=172.10.0.0/16 \
--ssh-public-key=~/.ssh/id_rsa.pub

kops edit cluster uswest2.foo.bar

change

  - cidr: 172.10.32.0/19
    name: us-west-2a
  - cidr: 172.10.64.0/19
    name: us-west-2b
  - cidr: 172.10.96.0/19
    name: us-west-2c

to something like this

  - cidr: 172.10.32.0/19
    id: subnet-xxxxxxxa
    name: us-west-2a
  - cidr: 172.10.64.0/19
    id: subnet-xxxxxxxb
    name: us-west-2b
  - cidr: 172.10.96.0/19
    id: subnet-xxxxxxxc
    name: us-west-2c

kops update cluster uswest2.foo.bar --yes

aws ec2 describe-route-tables --filters Name=vpc-id,Values=vpc-xxxxxxxx

output:

{
    "RouteTables": [
        {
            "Associations": [
                {
                    "RouteTableAssociationId": "rtbassoc-xxxxxxxx",
                    "Main": true,
                    "RouteTableId": "rtb-xxxxxxx1"
                }
            ],
            "RouteTableId": "rtb-xxxxxxx1",
            "VpcId": "vpc-xxxxxxxx",
            "PropagatingVgws": [],
            "Routes": [
                ...
            ]
        },
        {
            "RouteTableId": "rtb-xxxxxxx2",
            "VpcId": "vpc-xxxxxxxx",
            "PropagatingVgws": [],
            "Tags": [
                {
                    "Value": "uswest2.foo.bar",
                    "Key": "KubernetesCluster"
                },
                {
                    "Value": "uswest2.foo.bar",
                    "Key": "Name"
                }
            ],
            "Routes": [
                ...
            ]
        }
    ]
}

Kops creates the new route table rtb-xxxxxxx2 but does not automatically associate the subnets with it; I have to associate the subnets with the new route table rtb-xxxxxxx2 manually, and then everything is fine.

The same applies when deleting the cluster: the subnets MUST be disassociated first so that the route table can be deleted automatically, otherwise kops gives up before finishing the delete.
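
For reference, the manual workaround described above looks roughly like this with the AWS CLI (a sketch; the route table and subnet IDs are the placeholders used in this comment, and rtbassoc-yyyyyyyy stands for whatever association ID the describe call returns):

# Associate each existing subnet with the route table kops created
aws ec2 associate-route-table --route-table-id rtb-xxxxxxx2 --subnet-id subnet-xxxxxxxa
aws ec2 associate-route-table --route-table-id rtb-xxxxxxx2 --subnet-id subnet-xxxxxxxb
aws ec2 associate-route-table --route-table-id rtb-xxxxxxx2 --subnet-id subnet-xxxxxxxc

# Before deleting the cluster, look up and remove the associations again
aws ec2 describe-route-tables --route-table-ids rtb-xxxxxxx2 \
  --query 'RouteTables[].Associations[].RouteTableAssociationId' --output text
aws ec2 disassociate-route-table --association-id rtbassoc-yyyyyyyy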

@nidgetgod this is probably fixed in https://github.com/kubernetes/kops/releases/tag/1.5.0-alpha4 — do you mind verifying?

@chrislovecnm I can confirm it is still not working with 1.5.0-alpha4; existing subnets need to be added to the route table manually.

@hridyeshpant can I get

  1. kops version
  2. full create command / process that you are using

@chrislovecnm
kops version
Version 1.5.0-alpha4

  1. kops create cluster --v=3 --state=s3://ewe-kubernetes-test --cloud=aws \
    --name=kubetnets.us-west-2.dev.XXXX.com \
    --ssh-public-key=~/.ssh/test.pub \
    --master-size=t2.medium \
    --master-zones=us-west-2c,us-west-2a,us-west-2b \
    --network-cidr=10.38.0.0/16 \
    --vpc=vpc-7ca82ffb \
    --node-count=3 \
    --node-size=t2.micro \
    --topology private \
    --networking weave \
    --zones=us-west-2c,us-west-2a,us-west-2b
  2. kops edit cluster kubetnets.us-west-2.dev.XXXX.com --state=s3://ewe-kubernetes-test
  3. Update the subnet section in the configuration (note that I am not updating the type: Utility subnets):
    subnets:
    - cidr: 10.38.88.0/21
      name: us-west-2c
      type: Private
      id: subnet-827750da
      zone: us-west-2c
    - cidr: 10.38.96.0/21
      name: us-west-2a
      type: Private
      id: subnet-77de7b10
      zone: us-west-2a
    - cidr: 10.38.104.0/21
      name: us-west-2b
      type: Private
      id: subnet-78b64931
      zone: us-west-2b
    - cidr: 10.38.8.0/22
      name: utility-us-west-2c
      type: Utility
      zone: us-west-2c
    - cidr: 10.38.0.0/22
      name: utility-us-west-2a
      type: Utility
      zone: us-west-2a
    - cidr: 10.38.4.0/22
      name: utility-us-west-2b
      type: Utility
      zone: us-west-2b

  4. kops update cluster kubetnets.us-west-2.XXXX.expedia.com -v=3 --state=s3://ewe-kubernetes-test --yes

RESULT: a new route table is created, but the custom subnets (e.g. subnet-77de7b10, subnet-827750da and subnet-78b64931) are not associated with it, while the other default "type: Utility" subnets are added.
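
One way to confirm which subnets ended up associated with which route table (a sketch, using the VPC ID from this comment):

aws ec2 describe-route-tables --filters Name=vpc-id,Values=vpc-7ca82ffb \
  --query 'RouteTables[].{rtb:RouteTableId,subnets:Associations[].SubnetId}' --output json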

@hridyeshpant that is just what the doctor ordered :)

@kris-nova see above

@chrislovecnm Same as @hridyeshpant result, still not working.

And I got a new error message I'd never seen before:

I0124 13:00:48.006731   30989 executor.go:91] Tasks: 0 done / 52 total; 26 can run
I0124 13:00:48.461777   30989 vfs_castore.go:422] Issuing new certificate: "master"
I0124 13:00:48.995274   30989 vfs_castore.go:422] Issuing new certificate: "kubecfg"
I0124 13:00:49.613876   30989 vfs_castore.go:422] Issuing new certificate: "kubelet"
I0124 13:00:49.976573   30989 executor.go:91] Tasks: 26 done / 52 total; 11 can run
I0124 13:00:50.385197   30989 executor.go:91] Tasks: 37 done / 52 total; 13 can run
W0124 13:00:51.095478   30989 executor.go:109] error running task "LaunchConfiguration/master-us-west-2a.masters.uswest2.foo.bar" (9m59s remaining to succeed): IAM instance profile not yet created/propagated (original error: Invalid IamInstanceProfile: masters.uswest2.foo.bar)
W0124 13:00:51.095569   30989 executor.go:109] error running task "LaunchConfiguration/nodes.uswest2.foo.bar" (9m59s remaining to succeed): IAM instance profile not yet created/propagated (original error: Invalid IamInstanceProfile: nodes.uswest2.foo.bar)
I0124 13:00:51.095628   30989 executor.go:91] Tasks: 48 done / 52 total; 2 can run
W0124 13:00:51.415865   30989 executor.go:109] error running task "LaunchConfiguration/master-us-west-2a.masters.uswest2.foo.bar" (9m58s remaining to succeed): IAM instance profile not yet created/propagated (original error: Invalid IamInstanceProfile: masters.uswest2.foo.bar)
W0124 13:00:51.415957   30989 executor.go:109] error running task "LaunchConfiguration/nodes.uswest2.foo.bar" (9m58s remaining to succeed): IAM instance profile not yet created/propagated (original error: Invalid IamInstanceProfile: nodes.uswest2.foo.bar)
I0124 13:00:51.416004   30989 executor.go:124] No progress made, sleeping before retrying 2 failed task(s)
I0124 13:01:01.416228   30989 executor.go:91] Tasks: 48 done / 52 total; 2 can run
I0124 13:01:02.308902   30989 executor.go:91] Tasks: 50 done / 52 total; 2 can run
I0124 13:01:02.688324   30989 executor.go:91] Tasks: 52 done / 52 total; 0 can run
I0124 13:01:02.688418   30989 dns.go:140] Pre-creating DNS records
I0124 13:01:03.757112   30989 update_cluster.go:202] Exporting kubecfg for cluster
Wrote config for uswest2.foo.bar to "~/.kube/config"
Kops has changed your kubectl context to uswest2.foo.bar

@chrislovecnm @kris-nova
Okay, after playing around some more, I don't think we really need to add the custom subnets to the route table that kops creates. The route table kops creates routes 0.0.0.0/0 to the IGW, so associating subnets with it makes them public. When we use our own subnets, those subnets are already associated with the default route table, which has a NAT association so that the private subnets can access the internet. If kops is going to associate these private subnets with its own route table, then it should also create NATs in the public subnets and route through them, rather than routing 0.0.0.0/0 to the IGW.

Also, is there a way to tag the ELB? We run some automated cleanup that deletes any resource without the proper tag format. I am using cloudLabels for the masters/nodes and ASGs, but I don't know how to tag the public ELB.

So here's how shared subnets are supposed to work:

  • We shouldn't change them. So we won't associate them to a route table.
  • I think in practice this means that you have to specify all of them with an ID, or none of them. I'll add docs & validation.
  • Also I think we shouldn't create a route table if we're not creating any subnets. I'll add code

Can you add a separate issue for the ELB tagging please @hridyeshpant. I'm not entirely sure which ELB you're referring to, or which tag is missing ... I thought we tagged the API ELB, so this sounds like it could be a bug.

I tried with 1.5.0-alpha4 but it's still broken for me. Route53 records are created but stuck with the pre-populated 203.0.113.123 address.

Commands I used are same as my last comment.

@justinsb here are steps which i followed

  1. created three private subnets and associated them with the default route table
  2. the default route table has a NAT in the public subnet, so the private subnets can reach the internet
  3. run kops create cluster --v=3 --state=s3://ewe-kubernetes-test --cloud=aws \
    --name=kubetnets.us-west-2.dev.XXXX.com \
    --ssh-public-key=~/.ssh/test.pub \
    --master-size=t2.medium \
    --master-zones=us-west-2c,us-west-2a,us-west-2b \
    --network-cidr=10.38.0.0/16 \
    --vpc=vpc-7ca82ffb \
    --node-count=3 \
    --node-size=t2.micro \
    --topology private \
    --networking weave \
    --zones=us-west-2c,us-west-2a,us-west-2b
  4. update the subnet section in the configuration. Please note I am only using subnet IDs for the "Type: Private" subnets, not for the "Type: Utility" subnets, so three subnets use existing IDs and the other three keep the default values.
    subnets:
    - cidr: 10.38.88.0/21
      name: us-west-2c
      type: Private
      id: subnet-827750da
      zone: us-west-2c
    - cidr: 10.38.96.0/21
      name: us-west-2a
      type: Private
      id: subnet-77de7b10
      zone: us-west-2a
    - cidr: 10.38.104.0/21
      name: us-west-2b
      type: Private
      id: subnet-78b64931
      zone: us-west-2b
    - cidr: 10.38.8.0/22
      name: utility-us-west-2c
      type: Utility
      zone: us-west-2c
    - cidr: 10.38.0.0/22
      name: utility-us-west-2a
      type: Utility
      zone: us-west-2a
    - cidr: 10.38.4.0/22
      name: utility-us-west-2b
      type: Utility
      zone: us-west-2b
  5. kops update cluster kubetnets.us-west-2.XXXX.expedia.com -v=3 --state=s3://ewe-kubernetes-test --yes

Result:

  1. kops creates a new route table and associates only the subnets that don't have IDs (i.e. the "Type: Utility" ones); this route table routes 0.0.0.0/0 to the IGW.
  2. The existing subnets (the ones above with IDs) are already attached to the default route table, which routes through the NAT.

So everything is working fine. Let me know if this is not the expected behaviour. Earlier I had missed associating the existing subnets with the default route table.
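
To double-check which route a private subnet actually uses for egress, something like this helps (a sketch, using one of the subnet IDs from this comment; the 0.0.0.0/0 target should be a nat-... gateway rather than an igw-...):

aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-827750da \
  --query 'RouteTables[].Routes[?DestinationCidrBlock==`0.0.0.0/0`].{nat:NatGatewayId,igw:GatewayId}'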

@justinsb Same issue as @yissachar, using 1.5.0-alpha4. Of note: when I told kops to use existing subnets, the ELBs were not created and none of the DNS records were updated with a real IP address. When kops created the subnets, it updated the etcd DNS but still did not create the ELBs or update the ELB DNS. I didn't see any error messages in the logs relating to load balancers.

FYI: I am able to create a cluster with private subnets using the above steps, and it's working fine.
@yissachar and @justinsb I have a general question: why do we need utility subnets?
Could you please share a link to a network diagram showing how the private topology is used in kubernetes?
Sorry for asking this question here; let me know if there is a common discussion forum I can use for such queries.

+1 the above question. I think the whole point of using existing subnets is that you have a network topology already defined, and creating mirrored utility subnets as a requirement seems kinda confusing, if not a little destructive.

@hridyeshpant In what we are calling "private topology", kops places the masters and nodes in private subnets. The egress for these private subnets then lives in "utility" subnets. Currently, the only type of egress supported is a NAT Gateway (NGW), so the utility subnets are where the NGWs reside.
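
A quick way to see which subnets the NAT Gateways actually live in (a sketch):

aws ec2 describe-nat-gateways \
  --query 'NatGateways[].{id:NatGatewayId,subnet:SubnetId,state:State}' --output table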

If you haven't located it by now, you're welcome to join the kubernetes slack. A good channel to check out is #sig-aws.

Sorry for the duplicate issue. Issue #1600 can just be closed.

I just tried everything one more time. Unfortunately without any luck. I will post the details below:

What I'm trying to do
I want to create the following setup:
(network diagram "networking - optimal for github issue" attached to the original issue; image omitted)

I have created the following before running kops:

  • The VPC
  • 3 private subnets and 3 public subnets
  • 4 route tables (1 for each private subnet and 1 shared by all 3 public subnets, as depicted in the diagram)
  • 3 NAT Gateways with Elastic IPs (verified and tested with instances in each of the private subnets to make sure this wasn't the issue)

After the initial networking configuration, I ran the following command with kops Version 1.5.0-alpha4 (git-4b1307f).

kops create cluster \
    --node-count 3 \
    --zones eu-west-1a,eu-west-1b,eu-west-1c \
    --master-zones eu-west-1a,eu-west-1b,eu-west-1c \
    --dns-zone staging.XX.com \
    --node-size t2.large \
    --master-size t2.medium \
    --topology private \
    --networking weave \
    --vpc=${VPC_ID} \
    --image 293135079892/k8s-1.4-debian-jessie-amd64-hvm-ebs-2016-11-16 \
    --bastion \
    ${NAME}

Where the variables are the VPC_ID: vpc-xxxxxx and the NAME: kubernetes-staging-XX-com.

After the cluster creation completed, I ran kops edit cluster ... and changed the subnet configuration as follows:

apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2017-01-25T12:16:42Z"
  name: kubernetes.staging.XX.com
spec:
  api:
    loadBalancer:
      type: Public
  channel: stable
  cloudProvider: aws
  configBase: s3://kubernetes-staging-XX-com/kubernetes.staging.XX.com
  dnsZone: staging.XX.com
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-eu-west-1a
      name: eu-west-1a
    - instanceGroup: master-eu-west-1b
      name: eu-west-1b
    - instanceGroup: master-eu-west-1c
      name: eu-west-1c
    name: main
  - etcdMembers:
    - instanceGroup: master-eu-west-1a
      name: eu-west-1a
    - instanceGroup: master-eu-west-1b
      name: eu-west-1b
    - instanceGroup: master-eu-west-1c
      name: eu-west-1c
    name: events
  kubernetesApiAccess:
  - XX.XX.YY.YY/32
  kubernetesVersion: v1.4.7
  masterInternalName: api.internal.kubernetes.staging.XX.com
  masterPublicName: api.kubernetes.staging.XX.com
  networkCIDR: 10.0.0.0/16
  networkID: vpc-xxxxxxx
  networking:
    weave: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - XX.XX.YY.YY/32
  subnets:
  - cidr: 10.0.16.0/20
    egress: nat-XXXX1
    id: subnet-380e4861
    name: eu-west-1a
    type: Private
    zone: eu-west-1a
  - cidr: 10.0.32.0/20
    egress: nat-YYYY1
    id: subnet-926e68f7
    name: eu-west-1b
    type: Private
    zone: eu-west-1b
  - cidr: 10.0.48.0/20
    egress: nat-ZZZZ1
    id: subnet-161c2d60
    name: eu-west-1c
    type: Private
    zone: eu-west-1c
  - cidr: 10.0.0.0/20
    id: subnet-390e4860
    name: utility-eu-west-1a
    type: Utility
    zone: eu-west-1a
  - cidr: 10.0.64.0/20
    id: subnet-2f5e634b
    name: utility-eu-west-1b
    type: Utility
    zone: eu-west-1b
  - cidr: 10.0.80.0/20
    id: subnet-541c2d22
    name: utility-eu-west-1c
    type: Utility
    zone: eu-west-1c
  topology:
    bastion:
      bastionPublicName: bastion.kubernetes.staging.XX.com
    dns:
      type: Public
    masters: private
    nodes: private

Lastly, I ran kops update cluster ... --yes.

OUTCOME

Cannot connect to the cluster.

kubectl gives the following error:

$ kubectl get pods
Unable to connect to the server: EOF

The API load balancer status is 0 of 3 instances in service.

The Route53 DNS entry for api.internal.kubernetes.staging.XX.com does not contain the right IP; it still has the placeholder value 203.0.113.123.
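
A few checks that may narrow this down (a sketch; <api-elb-name> is a placeholder for whatever name the describe call returns):

# Is the API ELB registering any masters?
aws elb describe-load-balancers --query 'LoadBalancerDescriptions[].LoadBalancerName' --output text
aws elb describe-instance-health --load-balancer-name <api-elb-name>

# Is the Route53 record still the pre-created placeholder address?
dig +short api.internal.kubernetes.staging.XX.com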

@kaspernissen From an admittedly quick perusal, it looks like you've done everything correctly. I think this is a bug that @justinsb is attempting to fix in #1603, which was merged last night.

Maybe you would be able to compile from the master branch and report back on if there were any changes?

@geojaz the above description was with the latest commit on the master branch from 3 hours ago.

$ kops version
Version 1.5.0-alpha4 (git-4b1307f)

Thank you for the quick reply.

I just had a quick look at one of the master nodes, and found the following in the log of protokube:

error checking for required update: error querying namespace "kube-system": Get http://localhost:8080/api/v1/namespaces/kube-system: dial tcp [::1]:8080: getsockopt: connection refused
W0125 13:55:38.432350       1 kube_boot.go:129] error applying channel "s3://kubernetes-staging-XX-com/kubernetes.staging.XX.com/addons/bootstrap-channel.yaml": error running channels: exit status 1
I0125 13:56:38.432810       1 aws_volume.go:71] AWS API Request: ec2/DescribeVolumes
I0125 13:56:38.535353       1 tainter.go:53] Querying k8s for nodes with selector "kubernetes.io/role=master"
W0125 13:56:38.535798       1 kube_boot.go:115] error updating master taints: error querying nodes: Get http://localhost:8080/api/v1/nodes?labelSelector=kubernetes.io%2Frole%3Dmaster: dial tcp [::1]:8080: getsockopt: connection refused
I0125 13:56:38.538707       1 channels.go:47] checking channel: "s3://kubernetes-staging-XX-com/kubernetes.staging.XX.com/addons/bootstrap-channel.yaml"
I0125 13:56:38.538754       1 channels.go:34] Running command: channels apply channel s3://kubernetes-staging-XX-com/kubernetes.staging.XX.com/addons/bootstrap-channel.yaml --v=4 --yes
I0125 13:56:39.291637       1 channels.go:37] error running channels apply channel s3://kubernetes-staging-XX-com/kubernetes.staging.XX.com/addons/bootstrap-channel.yaml --v=4 --yes:
I0125 13:56:39.291672       1 channels.go:38] I0125 13:56:38.598389     167 root.go:89] No client config found; will use default config
I0125 13:56:38.598568     167 addons.go:36] Loading addons channel from "s3://kubernetes-staging-XX-com/kubernetes.staging.XX.com/addons/bootstrap-channel.yaml"
I0125 13:56:38.598639     167 s3context.go:84] Querying S3 for bucket location for "kubernetes-staging-XX-com"
I0125 13:56:38.967823     167 s3context.go:101] Found bucket "kubernetes-staging-XX-com" in region "us-east-1"
I0125 13:56:38.967855     167 s3fs.go:157] Reading file "s3://kubernetes-staging-XX-com/kubernetes.staging.XX.com/addons/bootstrap-channel.yaml"

error checking for required update: error querying namespace "kube-system": Get http://localhost:8080/api/v1/namespaces/kube-system: dial tcp [::1]:8080: getsockopt: connection refused
I0125 13:56:39.291853       1 channels.go:50] apply channel output was: I0125 13:56:38.598389     167 root.go:89] No client config found; will use default config
I0125 13:56:38.598568     167 addons.go:36] Loading addons channel from "s3://kubernetes-staging-XX-com/kubernetes.staging.XX.com/addons/bootstrap-channel.yaml"
I0125 13:56:38.598639     167 s3context.go:84] Querying S3 for bucket location for "kubernetes-staging-XX-com"
I0125 13:56:38.967823     167 s3context.go:101] Found bucket "kubernetes-staging-XX-com" in region "us-east-1"
I0125 13:56:38.967855     167 s3fs.go:157] Reading file "s3://kubernetes-staging-XX-com/kubernetes.staging.XX.com/addons/bootstrap-channel.yaml"

I think I might have created the bucket in the wrong region. Would that make a difference? I will give it a try with a bucket in eu-west-1, where the rest of the infrastructure is running.
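
The bucket region can be checked directly (a sketch, using the bucket name from this comment; us-east-1 is reported as a null/empty LocationConstraint):

aws s3api get-bucket-location --bucket kubernetes-staging-XX-com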

@kaspernissen thanks for digging into it, and sorry about the problems.

Most of those errors from protokube are benign - it is trying to reach the apiserver, but it can't because it isn't up yet.

In a separate issue I wrote up a step-by-step of what should be happening:
https://github.com/kubernetes/kops/issues/1612#issuecomment-274970324

It would be great if you could see how far along that list your cluster got. Also, I am justinsb on the kubernetes slack if you want to go through step by step.

AFAICT you did everything right here, so if we can figure out what step I forgot we can hopefully just get this fixed!

So good news / bad news @kaspernissen (et al):

  • Bad news: I couldn't reproduce. I did the same thing, and it "worked for me"
  • Good news: I'm using what will be beta1 (i.e. I have all my patches applied). So I think the thing to do here is to release beta1, and then we can see if this problem persists.

Oh ... and @hridyeshpant / @hollowimage - I've added validation for this now in the next version, but you should either let kops create all the subnets, or have it create none of them. So you would need to pre-create your "utility" subnets as well, as @kaspernissen has done here.

@yissachar FYI, I ended up realizing that I didn't associate the VPC with the hosted zone, and fixing that worked for me with 1.5.0-alpha4. The only place I saw an error message was in the protokube logs where it reported a dns zone issue. You might double check that you don't have the same issue.
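
For anyone else hitting this with a private hosted zone, the association can be checked and fixed along these lines (a sketch; the zone ID, region and VPC ID are placeholders):

# See which VPCs are associated with the private zone
aws route53 get-hosted-zone --id <ZONE_ID>

# Associate the cluster VPC if it is missing
aws route53 associate-vpc-with-hosted-zone --hosted-zone-id <ZONE_ID> \
  --vpc VPCRegion=us-west-2,VPCId=vpc-xxxxxxxx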

Is it necessary to specify the --image ? I was following this guide: https://www.nivenly.com/k8s-aws-private-networking/

@nckturner are you using private DNS, since you have to associate the VPC?

I will give it another try tomorrow. Great to hear that you got it working @justinsb - then I must have made a mistake somewhere. When will you release beta1?

@kaspernissen Yeah, I am using private dns.

@nckturner you don't need to specify the image; make sure you are using kops 1.5.0-alpha4.

Or ... beta-1 :)

I finally managed to create a cluster in the existing VPC with preconfigured subnets and NATs.

I forgot to add the --channel alpha argument. After adding this with --dns-zone=<ID OF PRIVATE ZONE> the cluster came up. I did have some issues with the 2 public endpoints (bastion and api) being created in the private zone. I manually moved them to the public zone instead, and everything is working great.

BTW, this was with kops 1.5.0-beta1.

I was able to get it to work with 1.5.0-beta1 and setting up my VPC/subnets from scratch. Not 100% sure if beta1 fixed things or there was some user error previously, or some combination of the two. Either way, I can verify that it works now with beta1!

Awesome - thanks for confirming @yissachar - there were definitely some fixes in beta1 - we can infer the CIDRs from the ids now (for VPCs & Subnets) and I think we don't create a route table if we don't need it.

What we don't have is support for sharing a subnet between two kubernetes clusters: specifically we can't create ELBs. But that is actually at core a kubernetes issue - working on it under https://github.com/kubernetes/features/issues/173, and opened #1677 to track any kops work.

Tagging as documentation as I think that is all that is left!

I'm using kops 1.5.1 + kubectl 1.5.2; with an existing subnet it won't work (I can't access the service).

Environment

  • VPC: vpc-0000000 172.20.0.0/16
  • Subnet a: subnet-0000000a 172.20.0.0/18
  • Subnet b: subnet-0000000b 172.20.64.0/18
  • Subnet c: subnet-0000000c 172.20.128.0/18
  • Route Table: rtb-00000001

Steps

kops create cluster \
--cloud=aws \
--channel=stable \
--master-count=1 \
--master-size=t2.micro \
--zones=us-west-2a,us-west-2b,us-west-2c \
--node-count=1 \
--node-size=t2.micro \
--dns-zone=foo.bar \
--name=uswest2.foo.bar \
--vpc=vpc-0000000 \
--network-cidr=172.20.0.0/16 \
--associate-public-ip=true \
--ssh-public-key=~/.ssh/id_rsa.pub
kops edit cluster uswest2.foo.bar

Change this

  subnets:
  - cidr: 172.20.32.0/19
    name: us-west-2a
    type: Public
    zone: us-west-2a
  - cidr: 172.20.64.0/19
    name: us-west-2b
    type: Public
    zone: us-west-2b
  - cidr: 172.20.96.0/19
    name: us-west-2c
    type: Public
    zone: us-west-2c

into this

  subnets:
  - cidr: 172.20.0.0/18
    id: subnet-0000000a
    name: us-west-2a
    type: Public
    zone: us-west-2a
  - cidr: 172.20.64.0/18
    id: subnet-0000000b
    name: us-west-2b
    type: Public
    zone: us-west-2b
  - cidr: 172.20.128.0/18
    id: subnet-0000000c
    name: us-west-2c
    type: Public
    zone: us-west-2c
kops update cluster uswest2.foo.bar --yes



kubectl create -f https://raw.githubusercontent.com/kubernetes/kops/master/addons/kubernetes-dashboard/v1.5.0.yaml

With the existing subnet IDs, the service can't be accessed, as described above.

I then repeated the same kops create cluster and kops edit cluster commands, but this time left out the subnet IDs and changed the CIDRs so that kops creates new subnets itself:

  subnets:
  - cidr: 172.20.192.0/24
    name: us-west-2a
    type: Public
    zone: us-west-2a
  - cidr: 172.20.193.0/24
    name: us-west-2b
    type: Public
    zone: us-west-2b
  - cidr: 172.20.194.0/24
    name: us-west-2c
    type: Public
    zone: us-west-2c

kops update cluster uswest2.foo.bar --yes

kubectl create -f https://raw.githubusercontent.com/kubernetes/kops/master/addons/kubernetes-dashboard/v1.5.0.yaml

Result

I can access https://api.uswest2.foo.bar/ui without any problem, and the route tables changed a bit: another route table was created with Main=no, and in its Routes tab two routes were added whose targets are eni-xxxxxxx1 and eni-xxxxxxx2 (one for the master EC2 instance, one for the node EC2 instance).

I'm seeing what @kaspernissen was seeing when I spin up a new cluster on my existing VPC and subnets:

kops create cluster \
  --node-count=3 \
  --zones=us-east-1d,us-east-1c,us-east-1b \
  --master-zones=us-east-1d,us-east-1b,us-east-1c \
  --dns-zone=sugarcrm.io \
  --name="test.k8s.sugarcrm.io" \
  --vpc="vpc-7c41c919" \
  --network-cidr=10.27.16.0/20 \
  --topology=private \
  --networking=weave \
  --channel=alpha

I edit the config to put in my subnet IDs and egress IDs for the subnets, and end up with this:

apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2017-02-09T16:07:25Z"
  name: test.k8s.sugarcrm.io
spec:
  api:
    loadBalancer:
      type: Internal
  channel: alpha
  cloudProvider: aws
  configBase: s3://engtools-k8sdefinitions/test.k8s.sugarcrm.io
  dnsZone: sugarcrm.io
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-us-east-1d
      name: d
    - instanceGroup: master-us-east-1b
      name: b
    - instanceGroup: master-us-east-1c
      name: c
    name: main
  - etcdMembers:
    - instanceGroup: master-us-east-1d
      name: d
    - instanceGroup: master-us-east-1b
      name: b
    - instanceGroup: master-us-east-1c
      name: c
    name: events
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.5.2
  masterInternalName: api.internal.test.k8s.sugarcrm.io
  masterPublicName: api.test.k8s.sugarcrm.io
  networkCIDR: 10.27.16.0/20
  networkID: vpc-7c41c919
  networking:
    weave: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 10.27.28.0/23
    egress: nat-0cce336cb2d205bbf
    id: subnet-ec6ea5b5
    name: us-east-1d
    type: Private
    zone: us-east-1d
  - cidr: 10.27.20.0/23
    egress: nat-0cce336cb2d205bbf
    id: subnet-838129f4
    name: us-east-1c
    type: Private
    zone: us-east-1c
  - cidr: 10.27.16.0/23
    egress: nat-0cce336cb2d205bbf
    id: subnet-16dc363d
    name: us-east-1b
    type: Private
    zone: us-east-1b
  - cidr: 10.27.22.0/23
    id: subnet-ce6ea597
    name: utility-us-east-1d
    type: Utility
    zone: us-east-1d
  - cidr: 10.27.18.0/23
    id: subnet-9e8129e9
    name: utility-us-east-1c
    type: Utility
    zone: us-east-1c
  - cidr: 10.27.24.0/23
    id: subnet-5fdc3674
    name: utility-us-east-1b
    type: Utility
    zone: us-east-1b
  topology:
    dns:
      type: Public
    masters: private
    nodes: private

The update command completes, but none of the masters ever attach to the load balancer, and protokube just keeps saying it can't find the API, even after running for over an hour.

When I let kops create the VPC, everything comes up correctly, but we want to use our existing VPC so that we don't have to get IT to set up our VPN routing for access to the cluster.

I found that if you just specify the VPC ID and subnet ranges and let kops create the subnets itself, everything works.

Also, load balancer attachment usually happens by tags; can you confirm yours are set properly? It should have the cluster name and the api tag on it.

If you specify subnet IDs, kops creates nothing around them, so you have to manually ensure your routing and security groups are properly set up.
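
Two quick checks along those lines (a sketch; <api-elb-name> is a placeholder, the subnet and VPC IDs are taken from the config above):

# Does the API ELB carry the expected KubernetesCluster tag?
aws elb describe-tags --load-balancer-names <api-elb-name>

# What routes and security groups do the shared subnets actually get?
aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-16dc363d
aws ec2 describe-security-groups --filters Name=vpc-id,Values=vpc-7c41c919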

So, I figured out what my problem was: I didn't have DNS hostnames enabled on my VPC.

Once I ran this

aws ec2 modify-vpc-attribute --vpc-id vpc-0000000 --enable-dns-hostnames "{\"Value\":true}"

the hosts came online.
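
The corresponding check, for anyone who wants to verify the attributes before creating a cluster (a sketch, using the VPC ID from this comment):

aws ec2 describe-vpc-attribute --vpc-id vpc-0000000 --attribute enableDnsHostnames
aws ec2 describe-vpc-attribute --vpc-id vpc-0000000 --attribute enableDnsSupport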

wow ... not going to read this novel ... can someone TLDR; on where we are at, and can we close.

Having the same problem as @kaspernissen (private topology / subnets configured by myself).

LoadBalancer creation only works if I tag my own pre-created subnets with "KubernetesCluster" and the value "name.bla". This is a dangerous workaround, as kops will delete the subnets if I delete the cluster afterwards.
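
For reference, the workaround amounts to something like this (a sketch; the tag key/value are from this comment, the subnet IDs are placeholders; note the deletion caveat above):

# DANGEROUS: kops will treat these subnets as its own and delete them with the cluster
aws ec2 create-tags --resources subnet-xxxxxxxa subnet-xxxxxxxb subnet-xxxxxxxc \
  --tags Key=KubernetesCluster,Value=name.bla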

Not sure how to solve this. How about adding a second tag like "KopsIgnore": if it is set to yes, those subnets won't be deleted. This tagging could help for every "self-created" resource. I could prepare a pull request for this (I would try to do it over the weekend) and update the docs, if this is an interesting solution.

@tsupertramp I would go for a single tag named kops with 2 possible values:

  • manage: kops manages the resource completely. It created the resource, can update it and also destroy it safely
  • update: it can only update the resource (e.g. routing table) but it didn't create it and is also not responsible for deleting it.

I like the tag approach. We use terraform to build our infrastructure, so it would be easy to add these tags to infrastructure shared with kops.

Closing this as the original issue was fixed, and this has turned into an umbrella issue for any existing VPC/subnet issues.

If there are any outstanding concerns about this issue that need to be addressed, please leave a comment explaining why this should be re-opened. If there are problems related to existing VPC/subnet that do not fall under the same root cause, please open a new issue to discuss them.
