We've been using terraform from within our CI for a year or two, authenticating with a device code (since our CI cannot open a browser window).
az login --use-device-code
az account set --subscription "${SUBSCRIPTION}"
# example of working az command
az resource tag --tags gitlabPipelineId=${CI_PIPELINE_ID} --resource-group "AA" ...
terraform init -input=false
TF_LOG=DEBUG OCI_GO_SDK_DEBUG=v terraform plan -input=false
(We're not using Service Principals in the apply stage, instead relying on the user authentication of the DevOps engineer who takes ownership for the change)
For about a month now, this (unchanged) setup has been broken: Error building AzureRM Client. We upgraded to the latest terraform and the latest 1.x azurerm provider, to no avail. This happens on multiple subscriptions (which are corporately managed).
az commands definitely work in our CI (incl write operations) - so we are definitely logged in properly using device code and there is successful communication with the Azure API.
But terraform fails to use those same credentials properly now, for some new reason.
$ az login --use-device-code
WARNING: To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code AAAAAAAAA to authenticate.
[
{
"cloudName": "AzureCloud",
"id": "a-x-x-x-x",
"isDefault": true,
"name": "MyCompany",
"state": "Enabled",
"tenantId": "aa-x-x-x-x",
"user": {
"name": "[email protected]",
"type": "user"
}
},
Additional testing shows that the exact same terraform config works in Azure's cloud shell. Same user, same terraform versions, same config, and same remote backend (terraform state). The only difference I see is our CI's location (behind a proxy) and its use of device login.
Or, maybe, the Azure API behaves differently, and introduced a breaking change for this combination of terraform and device login.
I obviously lack additional debug output from a successful run in our CI for comparison (i.e. from a month ago).
Comparing debug logs, the main difference I see between success and failure is that the line "Using Managed Service Identity for Authentication" is missing from the CI run but present in the successful Azure cloud shell run.
In place of that missing line, terraform seems to assume it should use a service principal, and then fails.
How does terraform poll what authentication method(s) are available?
# This is a minimal, slightly obfuscated, config - but I've tested that this also fails in the same way described as our full config
terraform {
  backend "azurerm" {
    storage_account_name = "terraformstateAA"
    container_name       = "tfstate"
    key                  = "terraform.tfstate"
    access_key           = "aaa"
  }
}
provider "azurerm" {
  version                    = "=1.44.0"
  skip_provider_registration = "true"
  subscription_id            = "aa-x-x-x-x"
}
provider "template" {
  version = "~> 2.1"
}
terraform {
  required_version = ">= 0.12"
}
resource "azurerm_application_security_group" "K8sCompute" {
  name                = "AAK8sQualV2Compute"
  location            = "North Europe"
  resource_group_name = "RG_AA"
  tags = {
  }
}
Working Azure Shell terraform run:
2020-06-09T16:56:17.350Z [WARN] plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unimplemented desc = unknown service plugin.GRPCStdio"
2020/06/09 16:56:17 [TRACE] GRPCProvider: Configure
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Service Principal / Client Certificate is applicable for Authentication..
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Multi Tenant Service Principal / Client Secret is applicable for Authentication..
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Service Principal / Client Secret is applicable for Authentication..
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Managed Service Identity is applicable for Authentication..
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Using Managed Service Identity for Authentication
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: [DEBUG] Using MSI msiEndpoint "http://localhost:50342/oauth2/token"
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Getting OAuth config for endpoint https://login.microsoftonline.com/ with tenant aaaaaaaa-bbbb-4444-dddd-777777777777
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: [DEBUG] getAuthorizationToken with MSI msiEndpoint "http://localhost:50342/oauth2/token", ClientID "" for msiEndpoint "https://management.azure.com/"
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: [DEBUG] getAuthorizationToken with MSI msiEndpoint "http://localhost:50342/oauth2/token", ClientID "" for msiEndpoint "https://graph.windows.net/"
Broken CI terraform run:
2020-06-09T18:53:13.599+0200 [WARN] plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unimplemented desc = unknown service plugin.GRPCStdio"
2020/06/09 18:53:13 [TRACE] GRPCProvider: Configure
2020-06-09T18:53:13.654+0200 [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Service Principal / Client Certificate is applicable for Authentication..
2020-06-09T18:53:13.654+0200 [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Multi Tenant Service Principal / Client Secret is applicable for Authentication..
2020-06-09T18:53:13.654+0200 [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Service Principal / Client Secret is applicable for Authentication..
2020-06-09T18:53:13.654+0200 [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Using Service Principal / Client Secret for Authentication
2020/06/09 18:53:13 [ERROR] <root>: eval: *terraform.EvalConfigProvider, err: Error building AzureRM Client: 2 errors occurred:
* A Client ID must be configured when authenticating as a Service Principal using a Client Secret.
* A Tenant ID must be configured when authenticating as a Service Principal using a Client Secret.
2020/06/09 18:53:13 [ERROR] <root>: eval: *terraform.EvalSequence, err: Error building AzureRM Client: 2 errors occurred:
* A Client ID must be configured when authenticating as a Service Principal using a Client Secret.
* A Tenant ID must be configured when authenticating as a Service Principal using a Client Secret.
Terraform should use the saved az authentication.
Terraform could not authenticate.
az login --use-device-code
terraform plan
Just upgraded the azure-cli package (I had missed that difference): it was 2.0.80, now 2.7.0.
Unfortunately, this does not help.
Anything I can do to check the accessTokens.json? I assume that's what the terraform provider uses somehow?
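One way to check the saved login without opening accessTokens.json by hand is to ask the Azure CLI for a token and look only at its expiry. The following is a minimal sketch, not something the provider is confirmed to do in exactly this form; the function name `check_cli_token` is hypothetical, while `az account get-access-token` is the regular CLI command.

```shell
# Hypothetical helper: confirm the saved Azure CLI login can still produce a token.
# Only the expiry time is printed, never the token itself.
check_cli_token() {
  expires="$(az account get-access-token --query expiresOn --output tsv 2>/dev/null)" || {
    echo "Azure CLI could not produce a token; try 'az login --use-device-code' again" >&2
    return 1
  }
  if [ -z "$expires" ]; then
    echo "Azure CLI returned no expiry information" >&2
    return 1
  fi
  echo "Azure CLI token available, expires on: $expires"
}
```

Calling `check_cli_token` right before `terraform plan` would show whether the CLI session itself is healthy, which helps separate a login problem from a provider configuration problem.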
I'm thinking this could be similar to:
We do have multiple subscriptions.
And I believe I may have recently been added to a new subscription, presumably with the same tenant, which may have triggered this bug?
On the other hand, terraform in the Azure Cloud Shell doesn't have a problem with all this.
I think I finally found the problem.
A colleague had set ARM_CLIENT_SECRET in our CI, i.e. stored outside our codebase (in the GitLab CI project settings), while he was working with Service Principals in another branch.
Apparently our production branch was inferring the credential mechanism from that variable, and failing.
Close? Or can the terraform output be improved to inform us that it has found this variable and intends to use it?
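Until the provider output says so itself, a CI job can report this on its own. Below is a rough sketch of a pre-flight check, assuming the job is meant to use the Azure CLI login; the function name `warn_on_sp_env` is hypothetical, and the list of variables is my assumption about which ones steer the provider towards Service Principal authentication.

```shell
# Hypothetical pre-flight step for a CI job that should use the Azure CLI login.
# Prints the names (never the values) of service-principal variables that are set.
warn_on_sp_env() {
  found=0
  for name in ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_CLIENT_CERTIFICATE_PATH; do
    # POSIX-compatible indirect lookup; names come only from the fixed list above.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "WARNING: $name is set; the azurerm provider may attempt Service Principal authentication" >&2
      found=1
    fi
  done
  # Non-zero exit status when at least one variable was found.
  return "$found"
}
```

In a pipeline script this could run as `warn_on_sp_env || exit 1` just before `terraform plan`, so a stray project-level variable stops the job with a readable message.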
hey @degerrit
Thanks for opening this issue.
When running in CloudShell, Terraform uses MSI rather than the Azure CLI for authentication, which is why this behaviour is different. Terraform has a list of supported authentication methods (shared by both the Azure Backend and the Azure Provider) which get tried in turn, if enabled; as such we work down the list until we find one that applies.
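The "tried in turn" behaviour can be illustrated with a small shell sketch. This is not the provider's code: the order is taken from the debug log lines in this issue, the Multi Tenant step is left out because its condition is not stated here, the conditions on `ARM_CLIENT_CERTIFICATE_PATH` and `ARM_USE_MSI` are assumptions, and the function name `pick_auth_method` is hypothetical.

```shell
# Simplified illustration of "first applicable method wins".
# Assumption: each method is considered applicable when its variable is set.
pick_auth_method() {
  if [ -n "${ARM_CLIENT_CERTIFICATE_PATH:-}" ]; then
    echo "Service Principal / Client Certificate"
  elif [ -n "${ARM_CLIENT_SECRET:-}" ]; then
    # A client secret alone is enough to select this method, even when the
    # client ID and tenant ID are missing and validation fails afterwards.
    echo "Service Principal / Client Secret"
  elif [ "${ARM_USE_MSI:-false}" = "true" ]; then
    echo "Managed Service Identity"
  else
    # Fallback assumed from this thread: the saved Azure CLI login.
    echo "Azure CLI"
  fi
}
```

With only `ARM_CLIENT_SECRET` exported, the sketch stops at the client secret branch and never reaches the Azure CLI fallback, which matches the failing CI log above.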
In this instance, since we're logging "Using Service Principal / Client Secret for Authentication", it appears that a Client Secret is being specified (since this is the criterion for using this auth method), so I'd recommend double-checking the Environment Variables being used here (since from the Terraform Configuration it appears that nothing's being passed inline).
There have been some changes to the way that the Azure CLI authentication works throughout the 1.x lifecycle, where we've gone from parsing the accessTokens.json file to shelling out to the Azure CLI instead; whilst this behaviour will change in an upcoming release of 2.x, this doesn't appear to be the root cause, since it appears that a Client Secret is being specified.
It's worth noting that the behaviour of the Azure CLI is unpredictable when run in a headless environment (for example, we frequently see spurious output from the Azure CLI). Whilst this may work as expected when fully configured (for example, by configuring any preferences such as data collection for the Azure CLI prior to running Terraform), this behaviour can change under our feet, so unfortunately this isn't something we officially support when running in an automated environment (although I can understand your use-case).
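As a rough sketch of what "fully configured" might look like in a CI job, the Azure CLI reads its settings from `AZURE_<SECTION>_<NAME>` environment variables. The values below are assumptions to adapt to your own policy, and they are not confirmed to remove every source of spurious output.

```shell
# Assumed pre-configuration for a non-interactive CI job; adjust to your policy.
# Set the data-collection preference up front (core.collect_telemetry).
export AZURE_CORE_COLLECT_TELEMETRY=no
# Pin the default output format (core.output) so it cannot vary between runners.
export AZURE_CORE_OUTPUT=json
```

These exports would go before `az login --use-device-code` in the pipeline script, so the settings are in place before the first CLI call.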
Since this should be fixed by removing the Client Secret being used here - I'm going to close this issue for the moment - but should you have further questions I believe you should be able to get an answer for this using one of the Community Resources.
Thanks!