Terraform-aws-eks: Cluster Autoscaler with IRSA

Created on 17 Mar 2020  ·  8 Comments  ·  Source: terraform-aws-modules/terraform-aws-eks

Hi,

I'm having problems enabling the Cluster Autoscaler using IRSA with this module.

I'm submitting a...

  • [*] support request - read the FAQ first!

What is the current behavior?

I'm running the example from https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/irsa

but I see this error:
E0317 11:46:48.531388 1 aws_manager.go:261] Failed to regenerate ASG cache: cannot autodiscover ASGs: WebIdentityErr: failed to retrieve credentials caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity status code: 403, request id: ccab18b4-4786-49a4-9168-d63c2dadccd4

F0317 11:46:48.531413 1 aws_cloud_provider.go:376] Failed to create AWS Manager: cannot autodiscover ASGs: WebIdentityErr: failed to retrieve credentials caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity

I'm not sure what I missed.

What's the expected behavior?

The Cluster Autoscaler should work with the existing cluster and ASG.

Module version 10.0
EKS 1.15

All 8 comments

I had installed the CA in the wrong namespace.

Hi @acarsercan, could you share a bit more information about which namespace turned out to be the right one in the end? I get the same issue when deploying the CA to the kube-system namespace.

I hit the same error although the namespace is correct, and I'm not sure what is causing it.

The namespace is used in the assume role policy. So if the namespace in the policy does not match the namespace of the service account then it won't work.

https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/irsa/irsa.tf#L8
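
To make the match concrete, here is a minimal sketch of the ServiceAccount that the role's assume role policy would have to line up with. The kube-system namespace and the role ARN format come from this thread; the account name is a placeholder, not the value used by the example.

```yaml
# Sketch only: the ServiceAccount the IAM role's trust policy has to match.
# The web identity token issued for this account carries the subject
#   system:serviceaccount:<namespace>:<name>
# and the condition in the assume role policy must contain exactly that string.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: "<SERVICE_ACCOUNT_NAME>"   # placeholder: must equal the name in the policy
  namespace: kube-system           # must equal the namespace in the policy
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::<AWS_ACCOUNT_ID>:role/cluster-autoscaler"
```

If either the namespace or the name differs from what the policy expects, sts:AssumeRoleWithWebIdentity is denied, which would be consistent with the 403 reported above.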

I deployed the autoscaler into kube-system, which is also the value of local.k8s_service_account_namespace in the example. That worked for me with the (deprecated) cluster-autoscaler chart from https://kubernetes-charts.storage.googleapis.com. Now that I use the autoscaler repo https://kubernetes.github.io/autoscaler with the cluster-autoscaler-chart chart, I get this error.

I noticed that the values format has changed a bit since then, so I adjusted the only part I saw that differed from the example shown here:

rbac:
  create: true
  serviceAccount:
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::<AWS_ACCOUNT_ID>:role/cluster-autoscaler"

But that didn't change much unfortunately, so any other ideas on what might be wrong?

@acarsercan do you have another solution that works for you now, since you didn't follow up on this ticket?

As per my reply earlier, I have not found a solution

so any other ideas on what might be wrong?

Then you need to work out what the difference between the 2 charts is.

  1. The role's assume role policy must match the serviceAccount created by the chart
  2. The role ARN annotation on the serviceAccount has to match a real role
  3. The OIDC details must be correct
  4. The app inside the container must be using a recent version of the AWS SDK so it supports the new "assume role with web identity" flow

That's about all I can think of 🙂

@djesionek I had a similar issue when using the new chart. It appears the serviceAccount name changed and my OIDC was expecting the old name. Setting it back using rbac.serviceAccount.name worked for me.
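
A sketch of what that override might look like in the chart values, extending the snippet posted earlier in the thread. The name is a placeholder for whatever the role's assume role policy expects (local.k8s_service_account_name in the example is the likely source, but that is an assumption here).

```yaml
# Sketch only: pin the serviceAccount name so it no longer depends on the
# chart's default naming, which appears to differ between the old and new chart.
rbac:
  create: true
  serviceAccount:
    # placeholder: use the exact name referenced in the role's assume role policy
    name: "<SERVICE_ACCOUNT_NAME_EXPECTED_BY_ROLE>"
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::<AWS_ACCOUNT_ID>:role/cluster-autoscaler"
```

The alternative would be to keep the chart's new default name and update the assume role policy instead; either way the two have to agree.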
