Kubespray: Unable to bring up LoadBalancer on AWS

Created on 3 May 2019 · 17 comments · Source: kubernetes-sigs/kubespray

Once the cluster is up and running, kubectl apply the following:

apiVersion: v1
kind: Service
metadata:
  name: httpbin
spec:
  selector:
    app: httpbin
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    targetPort: 80

The result is:

NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/httpbin      LoadBalancer   10.233.34.215   <pending>     80:31118/TCP   18m
service/kubernetes   ClusterIP      10.233.0.1      <none>        443/TCP        22h

The LoadBalancer stays in <pending>. With Kops the LoadBalancer is provisioned. I copied this policy from the master role:

    {
      "Effect": "Allow",
      "Action": ["elasticloadbalancing:*"],
      "Resource": ["*"]
    },

to the worker role policy. Any ideas why this is not working?

Environment:
AWS, provisioned with the contrib Terraform scripts, then Ansible with cluster.yml

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Linux 4.19.34-coreos x86_64
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=2079.3.0
VERSION_ID=2079.3.0
BUILD_ID=2019-04-22-2119
PRETTY_NAME="Container Linux by CoreOS 2079.3.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
  • Version of Ansible: ansible 2.7.10

Kubespray version (commit) (git rev-parse --short HEAD):
00369303

Network plugin used:
calico

Copy of your inventory file:

k8s-xxx-master0 ansible_host=10.134.8.97
k8s-xxx-master1 ansible_host=10.134.9.162
k8s-xxx-master2 ansible_host=10.134.10.217
k8s-xxx-worker0 ansible_host=10.134.8.13
k8s-xxx-worker1 ansible_host=10.134.9.25
k8s-xxx-worker2 ansible_host=10.134.10.130
k8s-xxx-etcd0 ansible_host=10.134.8.241
k8s-xxx-etcd1 ansible_host=10.134.9.159
k8s-xxx-etcd2 ansible_host=10.134.10.20
bastion ansible_host=xx.xx.xx.xx

[bastion]
bastion ansible_host=xx.xx.xx.xx ansible_user=core

[kube-master]
k8s-xxx-master0
k8s-xxx-master1
k8s-xxx-master2


[kube-node]
k8s-xxx-worker0
k8s-xxx-worker1
k8s-xxx-worker2


[etcd]
k8s-xxx-etcd0
k8s-xxx-etcd1
k8s-xxx-etcd2


[k8s-cluster:children]
kube-node
kube-master


[k8s-cluster:vars]
apiserver_loadbalancer_domain_name="elb-k8s-xxx-xxxxxxxxx.us-west-2.elb.amazonaws.com"

Command used to invoke ansible:
ansible-playbook -i inventory/xxx/hosts.yml --become --become-user=root cluster.yml

kind/bug lifecycle/rotten

All 17 comments

I'm having the same issue. We might need to set cloud_provider to aws in all.yml, but if I do that then my cluster does not get created.
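For reference, a minimal sketch of what that change looks like, assuming the standard inventory layout (the path below is the usual location, adjust it to your inventory name):

# inventory/<your-inventory>/group_vars/all/all.yml
# Enable the in-tree AWS cloud provider; Kubespray should then pass
# --cloud-provider=aws to the kube-apiserver, kube-controller-manager and kubelet.
cloud_provider: aws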

@habbas99 Exactly the same here. The only way I could get it to work was to bring the cluster up without cloud_provider: aws in all.yml, then log into each master, add the - --cloud-provider=aws line to two files under /etc/kubernetes/manifests/, and restart two services...

@mabushey Which services did you restart after adding the cloud provider flag in /etc/kubernetes/manifests/ for each master node?

I had an issue with manually adding the cloud-provider=aws line and restarting the services...

From: https://blog.scottlowe.org/2018/09/28/setting-up-the-kubernetes-aws-cloud-provider/:

You must have the --cloud-provider=aws flag added to the Kubelet before adding the node to the cluster. Key to the AWS integration is a particular field on the Node object, the .spec.providerID field, and that field will only get populated if the flag is present when the node is added to the cluster. If you add a node to the cluster and then add the command-line flag afterward, this field/value won't get populated and the integration won't work as expected. No error is surfaced in this situation (at least, not that I've been able to find).
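To illustrate, a sketch of what a correctly registered node looks like with the in-tree AWS provider (excerpt of kubectl get node <name> -o yaml; the node name and instance ID below are made up):

apiVersion: v1
kind: Node
metadata:
  name: ip-10-134-8-13.us-west-2.compute.internal   # hypothetical node name
spec:
  # Only populated when the kubelet registers with --cloud-provider=aws;
  # if it is missing, the AWS integration will not manage this node.
  providerID: aws:///us-west-2a/i-0abc123def456789a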

There are about 2,500 forks of this project because they don't accept pull requests (at least they won't take mine, because I run my own mail server). Most likely someone has already fixed this... I've also tried both the master and release-2.10 branches.

@habbas99 To specifically answer your question: I added - --cloud-provider=aws to /etc/kubernetes/manifests/kube-controller-manager.yaml and /etc/kubernetes/manifests/kube-apiserver.yaml, and then ran systemctl daemon-reload and systemctl restart kubelet.
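For anyone following along, an illustrative excerpt of where that line ends up in the static pod manifest (most existing flags omitted):

# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    # ...existing flags...
    - --cloud-provider=aws
# The same line goes into kube-apiserver.yaml; then restart the kubelet as described above.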

@mabushey Thanks! Now the ELB gets created in AWS but I'm not able to reach the paths that I defined in my ingress rules. When I try to hit the load balancer I get a 503. The only thing I noticed was that the workers are not added as instances to the ELB. Do I need to add the cloud-provider flag to the worker nodes as well?

@habbas99 - Read my previous comment (three up). The .spec.providerID field does not get set by adding the cloud provider later.

@mabushey I'm wondering if there is a workaround. I might try manually adding the worker instances to the created ELB, but I'm not sure it would make a difference.

Is Kubespray a dead project or just dead on AWS?

@mabushey Kubespray is indeed not dead at all.

You can try joining our Slack channel and asking for more help. This community is based on people helping out in their free time, so some issues will not get much attention. I think the main focus for people using Kubespray is bare-metal rather than AWS, which would be why the AWS issues do not get as much attention.

Can you try the latest master branch, and also include in the issue every step you take to set up your cluster?

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

This PR fixes this bug: https://github.com/kubernetes-sigs/kubespray/pull/4338
After this you can set cloud_provider: "aws".
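Once the cloud provider is wired up, the Service status should gain an ELB hostname instead of staying in <pending>; a sketch of the expected result (excerpt of kubectl get svc httpbin -o yaml; the hostname below is made up):

status:
  loadBalancer:
    ingress:
    - hostname: a1b2c3d4e5f6a7b8-1234567890.us-west-2.elb.amazonaws.com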

Also related to https://github.com/kubernetes-sigs/kubespray/issues/5139.

/remove-lifecycle rotten

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
