Rke: failed to run Kubelet: could not init cloud provider "aws": AWS cloud failed to find ClusterID

Created on 20 May 2019 · 6 Comments · Source: rancher/rke

All 6 comments

@kumarselva007 I ran into the same issue today while starting an RKE cluster to install Rancher in HA mode. The RKE documentation says the instances need to be tagged in AWS: https://rancher.com/docs/rke/latest/en/config-options/cloud-providers/aws/#iam-requirements

The issue was solved after I added the tag. Hope this works for you, too.

Let us know if that doesn't solve your issue.

All you need to do is add the tag to your EC2 instance or Launch Template/Configuration:

kubernetes.io/cluster/<CLUSTERID> : owned

Problem Solved :100:
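For anyone who wants to check the tag programmatically, here is a minimal Python sketch of the tag described above. The helper names (`cluster_tag`, `find_cluster_ids`) and the cluster ID `my-rke-cluster` are made up for illustration. The only parts taken from this thread are the `kubernetes.io/cluster/` key prefix and the `owned` value. The Key/Value dict shape is assumed to match what the EC2 API uses for tags.

```python
# Prefix of the tag key that the AWS cloud provider looks for (from this thread).
CLUSTER_TAG_PREFIX = "kubernetes.io/cluster/"


def cluster_tag(cluster_id, value="owned"):
    """Build the cluster tag as a Key/Value dict (hypothetical helper)."""
    if not cluster_id:
        raise ValueError("cluster_id must not be empty")
    return {"Key": CLUSTER_TAG_PREFIX + cluster_id, "Value": value}


def find_cluster_ids(tags):
    """Return the cluster IDs found in a list of Key/Value tag dicts.

    A key that is only the bare prefix (no ID after the slash) is ignored.
    """
    ids = []
    for tag in tags:
        key = tag.get("Key", "")
        if key.startswith(CLUSTER_TAG_PREFIX) and len(key) > len(CLUSTER_TAG_PREFIX):
            ids.append(key[len(CLUSTER_TAG_PREFIX):])
    return ids


if __name__ == "__main__":
    # "my-rke-cluster" is a placeholder; use your own cluster ID.
    tags = [{"Key": "Name", "Value": "node1"}, cluster_tag("my-rke-cluster")]
    print(find_cluster_ids(tags))
```

If you use boto3, the dict returned by `cluster_tag` should fit the `Tags` argument of `create_tags` on the EC2 client, but test that against your own account before relying on it.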

I have the same problem. The tags have been set correctly by Rancher, but it still doesn't work.

[controlPlane] Failed to upgrade Control Plane: [[[controlplane] Error getting node controlpane1: "controlpane1" not found]]

I am having this issue as well. It occurred during an upgrade of the cluster's Kubernetes version. SSHing into the failing node and tailing the kubelet log revealed this error message (indentation mine, for readability):

server.go:274] failed to run Kubelet: could not init cloud provider "aws": 
error finding instance i-0498a9ba1d28ead09: 
    "error listing AWS instances: 
        "AuthFailure: AWS was not able to validate the provided access credentials
    status code: 401,
    request id: 174ee5db-dfcc-4b1d-ace0-bd166d8cd401\""

What I find curious about this is that this cluster also fails to _read_ S3 snapshots it creates. It creates them just fine, I can see the etcd snapshot in S3. But the UI shows that authorization failed when trying to retrieve the snapshot information. I suspect the errors are related.

I am going to reboot the node and see how it goes. Hopefully it doesn't hose the kube upgrade.

Update: Rebooting the nodes in the cluster seems to have worked. Of course I went on to hose the cluster anyway, but that was an unrelated error.
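Two different kubelet failures show up in this thread under the same "could not init cloud provider" prefix: the missing ClusterID from the title and the AuthFailure above. A rough Python sketch for telling them apart when scanning logs follows. The function name and the returned labels are made up for illustration, and it only matches the substrings quoted in this thread, so treat it as a heuristic rather than a full parser.

```python
def classify_kubelet_error(message):
    """Map a kubelet log message to a likely cause (hypothetical heuristic)."""
    if 'could not init cloud provider "aws"' not in message:
        return "not an AWS cloud provider init failure"
    if "failed to find ClusterID" in message:
        # Usually means the kubernetes.io/cluster/ tag is missing on the instance.
        return "missing cluster tag"
    if "AuthFailure" in message:
        # AWS rejected the credentials the node presented.
        return "credentials rejected"
    return "unknown"


if __name__ == "__main__":
    title_error = (
        'failed to run Kubelet: could not init cloud provider "aws": '
        "AWS cloud failed to find ClusterID"
    )
    print(classify_kubelet_error(title_error))
```

The first case points at tagging, the second at the credentials on the node, which would explain why adding tags did not help in the AuthFailure reports.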

I am having the same issue with v2.5.5. I have added the tags to all the resources used by the cluster.
