1. What kops version are you running? The command kops version will display
this information.
Version 1.12.0-alpha.1 (git-d44c7fed9)
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
1.11.7
3. What cloud provider are you using?
openstack
4. What commands did you run? What is the simplest way to reproduce this issue?
kops create cluster --cloud openstack \
--name sd-dev-k8s.zedev.net \
--zones nova \
--network-cidr 192.168.220.0/24 \
--master-count 3 \
--node-count 3 \
--master-size m1.medium \
--node-size m1.xlarge.mem \
--topology private \
--bastion \
--ssh-public-key ~/.ssh/id_rsa.pub \
--networking weave \
--os-ext-net ze-public1 \
--kubernetes-version 1.11.7 \
--image container-linux-1967.6.0
5. What happened after the commands executed?
First failure I encountered:
I0222 17:21:09.900518 7176 create_cluster.go:1456] Using SSH public key: /home/ubuntu/.ssh/id_rsa.pub
W0222 17:21:10.990238 7176 create_cluster.go:713] Running with masters in the same AZs; redundancy will be reduced
error populating configuration: error loading config file: open /home/ubuntu/.openstack/config: no such file or directory
After figuring out how to build this file by looking at the history of the openstack tutorial:
I0222 17:21:33.436162 7198 create_cluster.go:1456] Using SSH public key: /home/ubuntu/.ssh/id_rsa.pub
W0222 17:21:34.466253 7198 create_cluster.go:713] Running with masters in the same AZs; redundancy will be reduced
error populating configuration: error getting section of Designate: section 'Designate' does not exist
After adding the appropriate section for Designate:
I0222 19:16:35.615058 7833 create_cluster.go:1456] Using SSH public key: /home/ubuntu/.ssh/id_rsa.pub
W0222 19:16:36.651965 7833 create_cluster.go:713] Running with masters in the same AZs; redundancy will be reduced
I0222 19:16:37.694541 7833 subnets.go:184] Assigned CIDR 192.168.220.32/27 to subnet nova
I0222 19:16:37.694681 7833 subnets.go:198] Assigned CIDR 192.168.220.0/30 to subnet utility-nova
Previewing changes that will be made:
I0222 19:16:43.940036 7833 builder.go:297] error reading hash file "https://kubeupv2.s3.amazonaws.com/kops/1.12.0-alpha.1/linux/amd64/utils.tar.gz.sha1": unexpected response code "403 Forbidden" for "https://kubeupv2.s3.amazonaws.com/kops/1.12.0-alpha.1/linux/amd64/utils.tar.gz.sha1": <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>0688EECF03949ED6</RequestId><HostId>kltff0LzZZrQuots7KsjVGHzGbFi9KBYJD/KbWeZegHikjFlGkpQ2cBxuN/UD5lDPh4pQKR2mjI=</HostId></Error>
cannot determine hash for "https://kubeupv2.s3.amazonaws.com/kops/1.12.0-alpha.1/linux/amd64/utils.tar.gz" (have you specified a valid file location?)
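To get past the 403 I pointed kops at the released 1.11.0 artifacts instead; the workaround is a single environment variable (the URL is the one echoed by urls.go in the log below):
export KOPS_BASE_URL=https://kubeupv2.s3.amazonaws.com/kops/1.11.0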
After setting that variable:
I0222 19:37:23.864263 7906 create_cluster.go:1456] Using SSH public key: /home/ubuntu/.ssh/id_rsa.pub
W0222 19:37:24.948779 7906 create_cluster.go:713] Running with masters in the same AZs; redundancy will be reduced
I0222 19:37:25.923374 7906 subnets.go:184] Assigned CIDR 192.168.220.32/27 to subnet nova
I0222 19:37:25.923516 7906 subnets.go:198] Assigned CIDR 192.168.220.0/30 to subnet utility-nova
Previewing changes that will be made:
W0222 19:37:31.655041 7906 urls.go:71] Using base url from KOPS_BASE_URL env var: "https://kubeupv2.s3.amazonaws.com/kops/1.11.0"
error building tasks: error reading manifest addons/dns-controller.addons.k8s.io/k8s-1.6.yaml: error opening resource: error executing resource template "addons/dns-controller.addons.k8s.io/k8s-1.6.yaml": error executing template "addons/dns-controller.addons.k8s.io/k8s-1.6.yaml": template: addons/dns-controller.addons.k8s.io/k8s-1.6.yaml:38:17: executing "addons/dns-controller.addons.k8s.io/k8s-1.6.yaml" at <DnsControllerArgv>: error calling DnsControllerArgv: unhandled cloudprovider "openstack"
6. What did you expect to happen?
Honestly, I didn't have firm expectations about this; it's alpha, after all.
7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
[https://gist.github.com/wfhartford/787e015c04f0c6f0b4d82b14097c92a6]
8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.
[https://gist.github.com/wfhartford/7405cca2e7507520491303fcdd4a1d36]
9. Anything else we need to know?
Thanks so much for working to support openstack. Please let me know if there are any other things I should try, or if I can help at all.
/sig openstack
Between each attempt, I had to delete data from my Swift container manually, because the command:
kops delete cluster sd-dev-k8s.zedev.net -v 10
fails with the following output:
I0222 21:39:48.214197 8021 factory.go:68] state store swift://kops-state-store
I0222 21:39:48.214414 8021 swiftfs.go:66] authenticating to keystone
I0222 21:39:48.742125 8021 swiftfs.go:418] Reading file "swift://kops-state-store/sd-dev-k8s.zedev.net/config"
I0222 21:39:49.126495 8021 cloud.go:311] authenticating to keystone
I0222 21:39:49.811824 8021 swiftfs.go:109] using openstack config found in /home/ubuntu/.openstack/config
I0222 21:39:49.812083 8021 cloud.go:402] Openstack using deprecated lbaasv2 api
panic: runtime error: index out of range
goroutine 1 [running]:
k8s.io/kops/pkg/resources/openstack.(*clusterDiscoveryOS).ListDNSRecordsets(0xc42119c8c0, 0x0, 0x0, 0x0, 0x0, 0x0)
/home/wesley/work/projects/go/src/k8s.io/kops/pkg/resources/openstack/dns.go:54 +0x816
k8s.io/kops/pkg/resources/openstack.(*clusterDiscoveryOS).ListDNSRecordsets-fm(0x0, 0x0, 0x0, 0x0, 0x0)
/home/wesley/work/projects/go/src/k8s.io/kops/pkg/resources/openstack/openstack.go:56 +0x2a
k8s.io/kops/pkg/resources/openstack.ListResources(0x3b33420, 0xc42119c7d0, 0x7ffe9529f554, 0x14, 0xc42119c7d0, 0xc420d77928, 0x42b244)
/home/wesley/work/projects/go/src/k8s.io/kops/pkg/resources/openstack/openstack.go:59 +0x3d1
k8s.io/kops/pkg/resources/ops.ListResources(0x7f2ba3239238, 0xc42119c7d0, 0x7ffe9529f554, 0x14, 0x0, 0x0, 0x0, 0x6, 0x1)
/home/wesley/work/projects/go/src/k8s.io/kops/pkg/resources/ops/collector.go:45 +0x46c
main.RunDeleteCluster(0xc4206bd680, 0x3ad6b80, 0xc42000e020, 0xc42034d2c0, 0x0, 0x0)
/home/wesley/work/projects/go/src/k8s.io/kops/cmd/kops/delete_cluster.go:134 +0x5ec
main.NewCmdDeleteCluster.func1(0xc420499900, 0xc42034dce0, 0x1, 0x3)
/home/wesley/work/projects/go/src/k8s.io/kops/cmd/kops/delete_cluster.go:79 +0xd6
k8s.io/kops/vendor/github.com/spf13/cobra.(*Command).execute(0xc420499900, 0xc42034dc50, 0x3, 0x3, 0xc420499900, 0xc42034dc50)
/home/wesley/work/projects/go/src/k8s.io/kops/vendor/github.com/spf13/cobra/command.go:760 +0x2c1
k8s.io/kops/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x58783a0, 0x2d71100, 0x0, 0x0)
/home/wesley/work/projects/go/src/k8s.io/kops/vendor/github.com/spf13/cobra/command.go:846 +0x30a
k8s.io/kops/vendor/github.com/spf13/cobra.(*Command).Execute(0x58783a0, 0x58aeb90, 0x0)
/home/wesley/work/projects/go/src/k8s.io/kops/vendor/github.com/spf13/cobra/command.go:794 +0x2b
main.Execute()
/home/wesley/work/projects/go/src/k8s.io/kops/cmd/kops/root.go:97 +0x87
main.main()
/home/wesley/work/projects/go/src/k8s.io/kops/cmd/kops/main.go:25 +0x20
Hey @wfhartford,
I believe you might be the first to have a cluster with Designate. I would be curious to know whether you run into issues deploying with a name suffixed .k8s.local.
Also, we have decided not to support the use of the config file for now and have updated the tutorial, found here:
https://github.com/kubernetes/kops/blob/master/docs/tutorial/openstack.md
This would cause other issues if you were to get around your current Designate issue.
Short of that, you've probably found a real issue in untested flows. I'll see if I can get a devstack instance going with Designate for testing.
I don't really care about Designate; if I had realised that changing to .k8s.local would eliminate that problem, I would have tried that.
It looks like that config file is still referenced by the DNS code: I removed the config file and changed the name as you recommended, and it was no longer needed:
Command:
kops create cluster --cloud openstack \
--name k8s.local \
--zones nova \
--network-cidr 192.168.220.0/24 \
--master-count 3 \
--node-count 3 \
--master-size m1.medium \
--node-size m1.xlarge.mem \
--topology private \
--bastion \
--ssh-public-key ~/.ssh/id_rsa.pub \
--networking weave \
--os-ext-net ze-public1 \
--kubernetes-version 1.11.7 \
--image container-linux-1967.6.0
Output:
I0222 22:26:30.957526 8258 create_cluster.go:1456] Using SSH public key: /home/ubuntu/.ssh/id_rsa.pub
W0222 22:26:32.016011 8258 create_cluster.go:713] Running with masters in the same AZs; redundancy will be reduced
I0222 22:26:33.064430 8258 subnets.go:184] Assigned CIDR 192.168.220.32/27 to subnet nova
I0222 22:26:33.064595 8258 subnets.go:198] Assigned CIDR 192.168.220.0/30 to subnet utility-nova
Previewing changes that will be made:
W0222 22:26:36.419263 8258 urls.go:71] Using base url from KOPS_BASE_URL env var: "https://kubeupv2.s3.amazonaws.com/kops/1.11.0"
I0222 22:26:37.607466 8258 apply_cluster.go:558] Gossip DNS: skipping DNS validation
error building tasks: must set ETCDMemberSpec.VolumeType on Openstack platform
After adding --etcd-storage-type rbd to the command, I think I got a successful execution.
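For completeness, a sketch of the create command with that flag added (the flags are copied from the command above; the cluster name is the one I use with kops delete/update below):
kops create cluster --cloud openstack \
--name sd-dev.k8s.local \
--zones nova \
--network-cidr 192.168.220.0/24 \
--master-count 3 \
--node-count 3 \
--master-size m1.medium \
--node-size m1.xlarge.mem \
--topology private \
--bastion \
--ssh-public-key ~/.ssh/id_rsa.pub \
--networking weave \
--os-ext-net ze-public1 \
--kubernetes-version 1.11.7 \
--image container-linux-1967.6.0 \
--etcd-storage-type rbd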
Regarding the comment about the delete command, I guess I just wasn't using it correctly; kops delete cluster --name sd-dev.k8s.local --yes does in fact delete the contents of my state store.
Scaling my cluster down to 1 master and 1 node with no bastion (our OpenStack cluster is very small), the kops create cluster command succeeded, followed by kops update cluster --name sd-dev.k8s.local --yes with no unexpected warnings.
However, none of the supplied commands (kubectl get nodes, kops validate cluster) succeeded. The IP address in the kubeconfig file points to a load balancer that was created. That load balancer seems to be configured correctly (it lists the internal IP of the master node), but connection attempts fail; curl -k https://<ip address> fails with a connection timeout.
I added a static route to the router created by kops (that's required on all our routers in OpenStack, I'm not really sure why), which allowed curl to connect, but it still fails with curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 172.20.57.11:443.
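For reference, adding a static route like that with the OpenStack CLI looks roughly like this (the router name and next-hop IP are placeholders, not my real values):
openstack router set --route destination=192.168.220.0/24,gateway=<next-hop-ip> <kops-router-name>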
All kubectl commands fail with:
Unable to connect to the server: EOF
kops validate cluster fails with:
unexpected error during validation: error listing nodes: Get https://172.20.57.11/api/v1/nodes: EOF
Could you ssh to the master and check journalctl? Also, it seems that you have not built nodeup yourself; these newest OpenStack changes do not exist in any release yet. Hopefully we will get 1.12 alpha out soon.
Thanks for the tip. I found a couple of things. Probably most fatally, name resolution wasn't working on my nodes; this was an issue with the networks created in OpenStack. The subnets were created without DNS servers. I'm not sure whether this is an issue with how kops created the networks or with how our OpenStack cluster is configured, but for any new subnet we have to manually configure DNS servers. I'm leaning towards that being a misconfiguration of our OpenStack cluster and have contacted our administrator.
I also discovered an issue with the cloud-init script, which I believe is created by kops. On line 33, the OS_PASSWORD variable is initialised. The literal password value is not escaped or quoted at all, so the $ in my password caused an unbound variable error.
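To illustrate the failure mode, a minimal sketch (the real script's contents aren't reproduced here, but the unbound-variable error implies it runs with set -u):
#!/bin/bash
set -eu
# Unquoted: the shell tries to expand $word123 as a variable, and under
# set -u (nounset) referencing an undefined variable aborts the script:
OS_PASSWORD=pa$word123     # fails: word123: unbound variable
# Single quotes keep the $ literal, so the assignment is safe:
OS_PASSWORD='pa$word123'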
@wfhartford actually I have the same problem with DNS all the time. I have been thinking about adding an --os-dns-servers flag where people could specify DNS servers as a comma-separated list. I have 4 different OpenStack environments, and in none of them does the DNS proxy requests onward. It means that I need to define DNS servers on the subnets, like https://github.com/zetaab/kops/pull/1/files#diff-d7e0a10fd45a576a68991826f7da5edeR136
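For reference, setting nameservers on an existing subnet by hand can be done with the OpenStack CLI; the IPs and subnet name below are placeholders:
openstack subnet set --dns-nameserver 8.8.8.8 --dns-nameserver 8.8.4.4 <subnet-name>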
@drekle what do you think, is that a good thing to add? It seems that there are other people who have this problem as well. (I thought I was the only one 😄)
@zetaab Interesting, good to know I'm not the only one.
@zetaab In your first comment, you mention building nodeup. I built kops from master, but did not do anything specific for nodeup. Would you mind giving me a quick pointer on how to do that? Does nodeup get built into the kops binary? After changing my OpenStack password to contain no special characters, deleting, rebuilding, and adding my static route and DNS entries, the cloud-init script ran without error on the master and the node, but I get the same errors as before (EOF from kops and kubectl, SSL_ERROR_SYSCALL from curl). Looking at journalctl on the master, I see some warnings from nodeup:
W0225 17:47:15.186233 770 main.go:142] got error running nodeup (will retry in 30s): cannot parse ConfigBase "swift://kops-state-store/sd-dev.k8s.local": can not find home directory
@drekle what do you think, is that a good thing to add? It seems that there are other people who have this problem as well. (I thought I was the only one 😄)
I'd support this. I'm just lucky that DNS servers are baked into my environment.
Would you mind giving me a quick pointer on how to do that?
@wfhartford
I have a VM hosting my binaries, with an A record drekle.mydomain.com pointing to its floating IP.
On this VM I check out and build kops in the following way:
make dep-ensure version-dist
This will place artifacts that I can pull into the .build directory.
You can run a webserver there, e.g. python -m SimpleHTTPServer 80 (Python 2) or python -m http.server 80 (Python 3).
You can then export the following environment variables to tell kops to use your built binaries. Note that this location must be accessible from the OpenStack VMs provisioned:
"PROTOKUBE_IMAGE": "http://drekle.mydomain.com/.build/dist/images/protokube.tar.gz",
"KOPS_BASE_URL": "http://drekle.mydomain.com/.build/dist/",
"NODEUP_URL": "http://drekle.mydomain.com/.build/dist/nodeup",
The example above assumes that you're hosting at the kops root directory. You could also serve at .build/dist and update your environment variables appropriately.
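In shell-export form (same values as above, which assume drekle's example hostname):
export PROTOKUBE_IMAGE=http://drekle.mydomain.com/.build/dist/images/protokube.tar.gz
export KOPS_BASE_URL=http://drekle.mydomain.com/.build/dist/
export NODEUP_URL=http://drekle.mydomain.com/.build/dist/nodeup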
@zetaab is this useful to document? I assume you are doing something similar?
I am using S3 to share this .build folder. However, any HTTP server that can serve plain files and is reachable from the OpenStack instances will work.
So my process is something like
make version-dist
(here I am copying the nodeup binary from place A to place B; for some reason nodeup is not in the correct place for me)
s3cmd sync .build s3://kopstest/
s3cmd setacl -P --recursive s3://kopstest/.build/
export PROTOKUBE_IMAGE=https://kopstest.s3foobar.com/.build/dist/images/protokube.tar.gz
export KOPS_BASE_URL=https://kopstest.s3foobar.com/.build/dist/
NOTE! This whole process will not be needed once the kops 1.12 alpha is out.
@wfhartford https://github.com/kubernetes/kops/pull/6530 could maybe solve some of your DNS problems.
Thanks so much for the help, both of you. I built everything and hosted it more or less as @drekle suggested, rebuilt my cluster, and now have a working Kubernetes cluster! To summarise the issues I faced that I think need attention in kops' OpenStack support:
- Using a real domain name in --name pulls in the Designate integration, which appears broken, and kops delete cluster panics in ListDNSRecordsets.
- The DNS code still requires the ~/.openstack/config file that has been removed from the tutorial.
- ETCDMemberSpec.VolumeType must be set explicitly (via --etcd-storage-type) or building tasks fails.
- Subnets are created without DNS nameservers, at least on our cloud.
- The cloud-init script does not quote the OS_PASSWORD value, so a password containing $ breaks it.
@wfhartford, where did you find the documentation about configuring .openstack/config? Can you share the link?
Some details of the configuration file were removed from the main OpenStack tutorial page by this commit: https://github.com/kubernetes/kops/commit/be034cf79fee644bb4c53616ee790fb92b4b9b4d#diff-2831416e6054eb7e98980240a6c3fe05. The file was only needed for me when I was trying to build the cluster with a real domain name in the --name parameter. That causes kops to try to integrate with OpenStack Designate, which as far as I know is just broken. Once I changed to a --name parameter ending in .k8s.local, kops no longer wanted to integrate with Designate and didn't look for the config file any more.