Azure-cli: `az vm run-command invoke` returns Deployment Failure with message "Finished executing command"

Created on 26 Jul 2018  路  18Comments  路  Source: Azure/azure-cli

Describe the bug
az vm run-command invoke -g $RSGROUP -n $VMNAME --command-id EnableRemotePS returns error message:

Deployment failed. Correlation ID: 5ba9e28a-b1ae-4793-a2e3-0f48babe00c4. VM has reported a failure when processing extension 'RunCommandWindows'. Error message: "Finished executing command".

To Reproduce

  1. Create new VM.
  2. Attempt to enable PS remoting using Azure CLI az vm run-command invoke -g $RSGROUP -n $VMNAME --command-id EnableRemotePS
  3. Observe error.

Expected behavior
Either:

A) No erroneous deployment failure message
B) (If actually failed) Better error message describing actual error.

Environment summary
Azure Shell: Bash

azure-cli (2.0.42)

acr (2.1.0)
acs (2.2.0)
advisor (0.6.0)
ams (0.2.0)
appservice (0.2.0)
backup (1.2.0)
batch (3.3.0)
batchai (0.4.0)
billing (0.2.0)
cdn (0.1.0)
cloud (2.1.0)
cognitiveservices (0.2.0)
command-modules-nspkg (2.0.2)
configure (2.0.18)
consumption (0.4.0)
container (0.3.0)
core (2.0.42)
cosmosdb (0.2.0)
dla (0.2.0)
dls (0.1.0)
dms (0.1.0)
eventgrid (0.2.0)
eventhubs (0.2.0)
extension (0.2.0)
feedback (2.1.4)
find (0.2.12)
interactive (0.3.26)
iot (0.2.0)
keyvault (2.2.0)
lab (0.1.0)
maps (0.3.0)
monitor (0.2.0)
network (2.2.0)
nspkg (3.0.3)
policyinsights (0.1.0)
profile (2.1.0)
rdbms (0.3.0)
redis (0.3.0)
reservations (0.3.0)
resource (2.1.0)
role (2.1.0)
servicebus (0.2.0)
servicefabric (0.1.0)
sql (2.1.0)
storage (2.1.0)
vm (2.1.0)

Python location '/opt/az/bin/python3'
Extensions directory '/home/user/.azure/cliextensions'

Python (Linux) 3.6.1 (default, Jul 13 2018, 23:21:56)
[GCC 5.4.0 20160609]
Compute-cli Service Attention bug

All 18 comments

There are nothing CLI can do really if server doesn't return relevant information

You're not wrong - running similar command as powershell script (e.g. Enable-PSRemoting -Force) returns same, indicating Windows 10 has possible bug in that command.

Any idea who owns --command-id EnableRemotePS? I don't believe its Azure-CLI folks. Really should be bug for Windows 10 team, but probably get more traction by bugging whomever owns that command template.

Maybe @koralski knows it?

Cleaning up old issues from 2016 - 2018. Please reopen this issue if it is still a concern.

Hi,

I'm running the azure-cli (2.0.74) on a MacOs (10.14.6) and I'm getting the same error.

This is how I run:

az vm run-command invoke \
        --command-id RunPowerShellScript \
        --name $VM_NAME \
        -g $RESOURCE_GROUP  \
        --scripts @myscript.ps1 \
        --parameters "arg1=$STORAGE_ACCOUNT_NAME" "arg2=$STORAGE_KEY"

Deployment failed. Correlation ID: 3f4d9971-8e39-4223-b9c0-3e2c146b5748. VM has reported a failure when processing extension 'RunCommandWindows'. Error message: "Finished executing command".

I tested my myscript.ps1 directly in the VM and it works fine.

``` myscript.ps1
param(
[string]$storageAccountName,
[string]$key,
[string]$fileShareName
)

Invoke-Expression -Command "cmdkey /add:$storageAccountName.file.core.windows.net /user:Azure\$storageAccountName /pass:$key"

New-PSDrive -Persist -Name "Z" -PSProvider "FileSystem" -Root "\$storageAccountName.file.core.windows.net\$fileShareName" -Scope "Global"
```

Any ideas?

Hi @tonholis, let me reopen this older issue and have someone look into it.

When you test this command on the VM do you run it under a local admin account or local system? I believe that Runcommand may be using local system, and if memory serves me correctly, there are problems mapping a drive letter using local system. Let me check with some other folks on the team and get back to you.

Hi @Drewm3 ,

I ran the PS script under the local admin account.

RunCommand script is always executed under local system account,
Next version (early 2020) of run command will support execution under s specified user/password.
If you currently need execution under different account maybe you can create a script that can take user/password as parameters and spawn another process that run under the account.

@tonholis, I believe the problem is likely the fact that the script is running under local system. If you would like to test the script directly to confirm/deny this is in fact the problem you can follow the instructions on this site to start an interactive cmd prompt under local system to test the script: https://www.gabsoftware.com/tips/run-cmd-or-any-process-as-system-account-on-windows/.

@tonholis, I am going to close this issue based on the fact that this is likely an issue with the script running under local system. If you find the problem is something other than running under local system, then we can reopen the issue.

@Drewm3
I don't think you have adequately addressed this issue. I have encountered the same behavior as the original reporter. This isn't a matter of getting the script to run, but instead a problem with retrieving error stream information from the script sent to the VM. The issue from what I can tell is that the script on the VM is not returning the stderr / stdout data back to the machine that invoked the az run-command. I don't think this has anything to do with the user the account runs under. _Perhaps_ the azure agent could be involved here in terms of permissions, but the key point is that the script neither aborts as expected nor pipes back the error when there are issues, despite the output claiming an error occurred.

All you have to do to reproduce this is send an command that has an error in the myscript.ps1 body and attempt to troubleshoot it as you normally would troubleshoot any other script. So far as I can tell, the only way to determine what the error was is to go here on the VM:
C:\Packages\Plugins\Microsoft.CPlat.Core.RunCommandWindows\1.1.2\Status

I would recommend trying to push a script with errors and troubleshoot that without logging onto the VM. If you are able to do that in a manner that is straightforward, please pass that information along. Otherwise, as far as I can tell this needs improvement.

Looks like the output below is what is desired. I got this to print once, but can't get it to produce this type of output despite several runs of the identical changeset on the exact same agent. Note that this still produces a success when it was not successful, but at least I can parse the general output:

Desired (note that this is JSON with inline log messages, github messes up the formatting. Attached an image as well for legibility):

error_well_formatted

2019-11-06T18:56:15.1783429Z RunPowerShellScript 2019-11-06T18:57:48.6590092Z { 2019-11-06T18:57:48.6590883Z "value": [ 2019-11-06T18:57:48.6591041Z { 2019-11-06T18:57:48.6591180Z "code": "ComponentStatus/StdOut/succeeded", 2019-11-06T18:57:48.6591456Z "displayStatus": "Provisioning succeeded", 2019-11-06T18:57:48.6591871Z "level": "Info", 2019-11-06T18:57:48.6592682Z "message": "<redacted_for_security_reasons>", 2019-11-06T18:57:48.6594233Z "time": null 2019-11-06T18:57:48.6594425Z }, 2019-11-06T18:57:48.6594576Z { 2019-11-06T18:57:48.6594755Z "code": "ComponentStatus/StdErr/succeeded", 2019-11-06T18:57:48.6594923Z "displayStatus": "Provisioning succeeded", 2019-11-06T18:57:48.6595246Z "level": "Info", 2019-11-06T18:57:48.6595454Z "message": "out-file : Could not find a part of the path <redacted_for_security_reasons>", 2019-11-06T18:57:48.6595793Z "time": null 2019-11-06T18:57:48.6595911Z } 2019-11-06T18:57:48.6596048Z ] 2019-11-06T18:57:48.6596165Z }

Undesired:
2019-11-06T19:35:48.8660098Z RunPowerShellScript 2019-11-06T19:36:52.2997793Z ERROR: Deployment failed. Correlation ID: c358a072-4307-441d-a65d-3210e90938ea. VM has reported a failure when processing extension 'RunCommandWindows'. Error message: "Finished executing command".

I was only able to repro the scenario of: Error message: "Finished executing command" when I provided a script that didn't actually exist. When I provide a real script I get the actual error output from the script.

Thanks @Drewm3 . I'm wondering if there is a bad parameter or something in my invocation then that it's still able to process the params that come earlier in the command but eventually fails and returns that error when it hits the bad params. Perhaps quote escaping or something? I'll play with it a bit more and see if I can come with a more concrete answer. But it definitely executes the script I'm sending, performs the desired action, and still returns that error message. The script I'm firing is basically the azure pipeline agent registration script that Azure DevOps generates with some minor modifications and parameterization (azd_registration_script.txt attached). That script is being fired by Azure CLI from Azure DevOps using an Oauth token (az_cli_invocation.txt attached).

As an aside, if you know of a way to register VM's automatically via terraform or some other better tooling that is not manual in nature feel free to share. But this method does work aside from the difficulty that comes about when it fails silently without a clear error.

azd_registration_script.txt
az_cli_invocation.txt

Also please disregard the trailing backtick on the first script, I initially tried to paste the script as code but github was not formatting it well. I suspect that was a result of all the powershell syntax, so I went with a file but forgot to remove that.

On errors, it should return exit code 1 along with outputs the script made until it failed. Current behavior makes it impossible to get both useful info for debugging and success/failure result of execution.

$ az vm run-command invoke --ids $vm_id --command-id RunPowerShellScript --scripts "echo hoge"
{
  "value": [
    {
      "code": "ComponentStatus/StdOut/succeeded",
      "displayStatus": "Provisioning succeeded",
      "level": "Info",
      "message": "hoge",
      "time": null
    },
    {
      "code": "ComponentStatus/StdErr/succeeded",
      "displayStatus": "Provisioning succeeded",
      "level": "Info",
      "message": "",
      "time": null
    }
  ]
}
$ az vm run-command invoke --ids $vm_id --command-id RunPowerShellScript --scripts "echo hoge; exit 1"
Deployment failed. Correlation ID: 5fc12f27-e0b3-4aae-b841-ce8490c96dfa. VM has reported a failure when processing extension 'RunCommandWindows'. Error message: "Finished executing command".
$ az vm run-command invoke --ids $vm_id --command-id RunPowerShellScript --scripts "Write-Error hoge"
{
  "value": [
    {
      "code": "ComponentStatus/StdOut/succeeded",
      "displayStatus": "Provisioning succeeded",
      "level": "Info",
      "message": "",
      "time": null
    },
    {
      "code": "ComponentStatus/StdErr/succeeded",
      "displayStatus": "Provisioning succeeded",
      "level": "Info",
      "message": "C:\\Packages\\Plugins\\Microsoft.CPlat.Core.RunCommandWindows\\1.1.2\\Downloads\\script19.ps1 : hoge\n    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException\n    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,script19.ps1\n ",
      "time": null
    }
  ]
}
$ az vm run-command invoke --ids $vm_id --command-id RunPowerShellScript --scripts "Write-Error hoge -ErrorAction Stop"
Deployment failed. Correlation ID: 58a940be-580e-4eb2-83a8-17b467f0de8e. VM has reported a failure when processing extension 'RunCommandWindows'. Error message: "Finished executing command".

@koralski, could you look into these latest examples? It looks like the problem is how the custom script is handling the cases when the script exits with a non-zero value. It seems that there should be a way to indicate in the run command that Azure successfully attempted to run the script, but the script execution failed along with the existing StdOut and StdErr streams.

If I am correct, then should this be treated as a feature enhancement for Runcommand or a bug in the current implementation?

If I am incorrect, then could you provide more details on what is causing the behavior above?

We investigated the issue and found that in some cases where we have script errors, powershell process exit with code different from 0 which makes the extension provisioning to fail and report a generic error instead of captures output.
Fix is in process and should be public by the end of the week (12/8/2019).

Update: Fix is deployed.

Closing this now that the fix is deployed

Was this page helpful?
0 / 5 - 0 ratings