Terraform-provider-azurerm: Error creating Managed Kubernetes Cluster: ServicePrincipalNotFound

Created on 12 Feb 2020  ·  7 Comments  ·  Source: terraform-providers/terraform-provider-azurerm

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request

Terraform (and AzureRM Provider) Version

  • Terraform Core version: 0.12.16
  • AzureRM Provider version: 1.41.0
  • AzureAD Provider version: 0.7.0

Terraform Configuration Files

resource "azuread_application" "auth" {
  name            = var.service_principal_name
  identifier_uris = [var.service_principal_name]
}

resource "azuread_service_principal" "auth" {
  application_id = azuread_application.auth.application_id
}

resource "azuread_service_principal_password" "auth" {
  service_principal_id = azuread_service_principal.auth.id
  value                = var.service_principal_secret
  end_date_relative    = "43800h" # 5 years

  # XXX: wait for server replication before attempting role assignment creation
  provisioner "local-exec" {
    command = "sleep 10"
  }
}

resource "azurerm_role_assignment" "auth" {
  scope                = data.azurerm_subscription.current.id
  role_definition_name = "Contributor"
  principal_id         = azuread_service_principal.auth.id
}

resource "azurerm_kubernetes_cluster" "aks" {
  name                = var.cluster
  location            = azurerm_resource_group.rg.location
  kubernetes_version  = "1.15.7"
  resource_group_name = azurerm_resource_group.rg.name
  node_resource_group = var.resource_group_aks_nodes
  dns_prefix          = var.deployment_prefix

  default_node_pool {
    name                = "common"
    enable_auto_scaling = true
    min_count           = 1
    max_count           = 100
    vm_size             = var.common_nodes.type
    max_pods            = 80
    os_disk_size_gb     = var.common_nodes.disk_size_gb
  }

  service_principal {
    client_id     = azuread_service_principal.auth.application_id
    client_secret = azuread_service_principal_password.auth.value
  }
}

Description / Feedback

Hello,

We are having an issue when trying to create an AKS cluster with Terraform. We create a service principal with a password using the AzureAD provider. Then we create the Kubernetes cluster with these credentials, and we often get this error:

Error: Error creating Managed Kubernetes Cluster "example" (Resource Group "example-rg"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="ServicePrincipalNotFound" Message="Service principal clientID: XXX not found in Active Directory tenant XXX, Please see https://aka.ms/aks-sp-help for more details."
  on aks.tf line 1, in resource "azurerm_kubernetes_cluster" "aks":
   1: resource "azurerm_kubernetes_cluster" "aks" {

We have tried to work around this error using a provisioner with a sleep, but it's a really bad solution.
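
For what it's worth, here is a minimal sketch of the same delay expressed as a first-class resource instead of a local-exec provisioner, assuming the hashicorp/time provider is available; the resource names below are illustrative and are not part of this issue:

resource "time_sleep" "wait_for_aad_replication" {
  # Give Azure AD time to replicate the new service principal
  # before anything tries to reference it.
  depends_on      = [azuread_service_principal_password.auth]
  create_duration = "60s"
}

resource "azurerm_kubernetes_cluster" "aks" {
  # ... same arguments as in the configuration above ...
  depends_on = [time_sleep.wait_for_aad_replication]
}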

References


We have seen similar issues, but they are marked as closed:

Also, we have seen a related issue on AKS:

bug  service/kubernetes-cluster  upstream/microsoft

Most helpful comment

I think this workaround cannot be the solution. It's OK as a way to get by until there is a proper fix, but it's not acceptable for an official Terraform provider for a first-class cloud.

All 7 comments

That's something we have been hitting since the very beginning. Your workaround is exactly what I'm using, but I have it at a 60-second sleep :( . It's not elegant, but it works.

I think this workaround cannot be the solution. It's OK as a way to get by until there is a proper fix, but it's not acceptable for an official Terraform provider for a first-class cloud.

@arodriguezdlc, unfortunately, while we have added some waits to azuread, it's not enough, as the API calls return before the new SP is replicated across data centres. There isn't much we can do on our end, and MSFT will need to fix things on their side.

OK, we have reported this issue to Microsoft too. If we have any news, we will share it here. For now, we continue with the sleep workaround.

Thank you!!

👋

AKS is gradually moving away from using a Service Principal defined inline to using a Managed Identity; as such, over time the service_principal block will become unsupported.

Since support for Service Principals is being superseded by Managed Identities, I'd suggest looking towards the new Managed Identity-only clusters, which will be supported in v2.5 of the Azure Provider and do not have this requirement.
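
For illustration, a minimal sketch of what such a Managed Identity-based cluster could look like, assuming azurerm >= 2.5.0; the identity block replaces service_principal, and the node pool values here are placeholders:

resource "azurerm_kubernetes_cluster" "aks" {
  name                = var.cluster
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  dns_prefix          = var.deployment_prefix

  default_node_pool {
    name       = "common"
    node_count = 1
    vm_size    = "Standard_DS2_v2"
  }

  # System-assigned managed identity instead of the service_principal block,
  # so there is no Azure AD replication delay to work around.
  identity {
    type = "SystemAssigned"
  }
}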

As support for Managed Identity-only AKS clusters will be available in v2.5 of the Azure Provider, I'm going to assign this issue to that milestone so that @hashibot can comment when the release is available, and then close this issue for the moment.

Thanks!

This has been released in version 2.5.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.5.0"
}
# ... other configuration ...

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!
