Packer: winrm hyperv 401 error

Created on 26 Apr 2018  ·  27 Comments  ·  Source: hashicorp/packer

Packer 1.2.1 on Windows 10
Using WinRM to spin up a Windows 7 VM

I have an existing Hyper-V Windows 7 VM that has been exported.
I have a Packer script that clones this VM, using the WinRM communicator with SSL.

"type": "hyperv-vmcx",
"clone_from_vmxc_path": "C:\\export\\Windows 7 Test",
"winrm_password": "{{ user `password`}}",
"winrm_username": "{{ user `username`}}",
"winrm_timeout": "1h",
"winrm_insecure": true,
"winrm_use_ntlm": true,
"winrm_use_ssl": true,

Checked that the firewall/ports are open on the host and the target VM.
Able to connect over TCP to port 5986 on the new VM.
Added the target IP address to the trusted hosts on the host.
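
As a rough illustration of the port check described above, a TCP probe could look like the sketch below. The function name `can_connect` is made up for this example. A successful probe only shows that the port is reachable; it says nothing about whether WinRM authentication will succeed.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection performs the TCP handshake; the context manager closes the socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `can_connect("<ipadr>", 5986)` with the VM's address would correspond to the HTTPS listener check; 5985 would be the HTTP listener.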

$so = New-PsSessionOption -SkipCACheck -SkipCNCheck
Enter-PSSession -ComputerName <ipadr> -UseSSL -Credential (Get-Credential) -SessionOption $so

Able to use SSL and access the VM from PowerShell using the above command.
But Packer's WinRM connection hangs, throwing a 401 invalid content type error.

==> hyperv-vmcx: Creating temporary directory...
==> hyperv-vmcx: Creating floppy disk...
    hyperv-vmcx: Copying files flatly from floppy_files
    hyperv-vmcx: Copying file: floppy/Hello.ps1
    hyperv-vmcx: Done copying files from floppy_files
    hyperv-vmcx: Collecting paths from floppy_dirs
    hyperv-vmcx: Resulting paths from floppy_dirs : []
    hyperv-vmcx: Done copying paths from floppy_dirs
==> hyperv-vmcx: Creating switch 'packer-hyperv-iso' if required...
==> hyperv-vmcx:     switch 'packer-hyperv-iso' already exists. Will not delete on cleanup...
==> hyperv-vmcx: Cloning virtual machine...
==> hyperv-vmcx: Enabling Integration Service...
==> hyperv-vmcx: Mounting floppy drive...
==> hyperv-vmcx: Skipping mounting Integration Services Setup Disk...
==> hyperv-vmcx: Mounting secondary DVD images...
==> hyperv-vmcx: Configuring vlan...
==> hyperv-vmcx: Starting the virtual machine...
==> hyperv-vmcx: Waiting 10s for boot...
==> hyperv-vmcx: Host IP for the HyperV machine: ipaddr
==> hyperv-vmcx: Typing the boot command...
==> hyperv-vmcx: Waiting for WinRM to become available.

Packer log
WinRM connection err: http response error: 401 - invalid content type

bug builder/hyperv

All 27 comments

Can I have the full debug logs, please? Set the environment variable PACKER_LOG=1 and rerun.

log.txt
I have attached the log file.

I ran a few tests to check the different scenarios that yielded the error:
1) With winrm_use_ssl: false, tested on port 5985. Got 401 invalid content type.
2) With winrm_use_ssl: true and winrm_insecure: true, which does not verify certs at the remote end, still a 401 error.
3) With winrm_use_ssl: true and winrm_insecure: false, which verifies certs at the remote end, I got
unknown error Post https://:5986/wsman: x509: cannot validate certificate for because it doesn't contain any IP SANs
Case 3 is expected, since my self-signed cert on the remote host has the host name but not the IP address.
I am able to use a PowerShell session like:
Enter-PSSession -ComputerName -Credential (Get-Credential) and am able to log in successfully.
I have the WinRM service running, ports/firewall open, HTTP and HTTPS listeners configured on the host and the remote machine, and the encryption setting enabled through the Group Policy editor.

Any help decoding the 401 invalid content type error is very much appreciated.
I want to know how the WinRM credentials are transmitted by Packer, since the same credentials work with the PowerShell Get-Credential cmdlet.

My initial guess is that your username and password are escaped in a way we don't expect.

This might be a good question to bring to our mailing list, since you're more likely to run into someone there who has the same setup as you: https://groups.google.com/forum/#!forum/packer-tool .

Hi, I started to experience the same problem after setting winrm_use_ntlm to true.

Packer version: 1.2.3
Builder: amazon-ebs
Communicator: winrm
OS: Windows Server 2016

Log:

[INFO] Attempting WinRM connection...
[DEBUG] connecting to remote shell using WinRM
[ERROR] connection error: http response error: 401 - invalid content type
[ERROR] WinRM connection err: http response error: 401 - invalid content type

On the mailing list, you mentioned that wireshark showed a malformed NTLM packet. Do you have any more details on that?

I was interested in this issue and did some digging myself. Here is a way to run through this scenario, which also contains some of the material I am going to refer to: https://github.com/jborean93/github-misc/tree/master/packer/issue-6205. What seems to be happening:

  • In OP's case, Packer is connecting with Basic auth but this is disabled by default on Windows, causing the 401 errors
  • In the latter case, Packer is connecting with NTLM auth but the underlying libraries do not support message encryption
  • The NTLM authentication process is fine, even though Wireshark says there's a malformed packet.
  • Unless the Windows host is configured to allow unencrypted messages, it will keep sending a 401 error

Digging a bit further, here is the WireShark capture for Packer (frame 736 of https://github.com/jborean93/github-misc/blob/master/packer/issue-6205/wireshark.pcang)

You can see it is sending a negotiate message with the value TlRMTVNTUAABAAAABYIAAA==. In hex this message is

4E 54 4C 4D 53 53 50 00    (NTLMSSP\0)
01 00 00 00 05 82 00 00    (01 00 00 00 is message 1, 05 82 00 00 is the flags)

This follows the correct structure up until the point where the domain name/workstation/version would be supplied (they are omitted in this case). While they are part of the spec, I don't think they are mandatory; Wireshark is just being pedantic about it and Windows responds correctly.

While Wireshark sees it as a malformed packet, Windows is fine with it and sends the challenge, and eventually the auth process completes. In this case the host has already allowed unencrypted messages, so packet 744 indicates a 200 and the dance continues.

When looking at the Ansible capture (frame 766), we get the same process but it is also encrypting the WinRM data, as the underlying library supports that. For comparison, let's get the negotiate message sent in that exchange: TlRMTVNTUAABAAAAMpCI4gAAAAAoAAAAAAAAACgAAAAGAbEdAAAADw==

Analysing the NTLM token as a hex string we get:

4E 54 4C 4D 53 53 50 00    (NTLMSSP\0)
01 00 00 00 32 90 88 E2    (01 00 00 00 message 1, 32 90 88 E2 is the flags)
00 00 00 00 28 00 00 00    (00 00 00 00 domain name length is 0, 28 00 00 00 starts at offset 40)
00 00 00 00 28 00 00 00    (00 00 00 00 workstation name length is 0, 28 00 00 00 starts at offset 40)
06 01 B1 1D 00 00 00 0F    (06 01 B1 1D 00 00 00 0F version is 6.1 (Build 7601); NTLM Revision 15)

This message contains the info regarding the domain name, workstation, and version (even though the values for these are not set/null), and this is why Wireshark does not see it as a malformed packet.
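
For anyone who wants to repeat the decoding of the two negotiate messages above, here is a minimal Python sketch. The function name `parse_negotiate` and the choice of which flags to list are mine; the flag values are taken from the MS-NLMP specification, and only the fixed 16-byte header (signature, message type, flags) is parsed.

```python
import base64
import struct

# A selection of NEGOTIATE flags from the MS-NLMP specification (not the full set).
FLAGS = {
    0x00000001: "NEGOTIATE_UNICODE",
    0x00000004: "REQUEST_TARGET",
    0x00000010: "NEGOTIATE_SIGN",
    0x00000020: "NEGOTIATE_SEAL",
    0x00000200: "NEGOTIATE_NTLM",
    0x00008000: "NEGOTIATE_ALWAYS_SIGN",
    0x00080000: "NEGOTIATE_EXTENDED_SESSIONSECURITY",
    0x02000000: "NEGOTIATE_VERSION",
    0x20000000: "NEGOTIATE_128",
    0x40000000: "NEGOTIATE_KEY_EXCH",
    0x80000000: "NEGOTIATE_56",
}

# The two base64 tokens quoted in this comment.
PACKER_TOKEN = "TlRMTVNTUAABAAAABYIAAA=="
ANSIBLE_TOKEN = "TlRMTVNTUAABAAAAMpCI4gAAAAAoAAAAAAAAACgAAAAGAbEdAAAADw=="


def parse_negotiate(token_b64: str) -> dict:
    """Decode the fixed header of an NTLM NEGOTIATE (type 1) message."""
    raw = base64.b64decode(token_b64)
    # 8-byte signature, then two little-endian 32-bit integers: message type and flags.
    signature, msg_type, flags = struct.unpack_from("<8sII", raw, 0)
    if signature != b"NTLMSSP\x00" or msg_type != 1:
        raise ValueError("not an NTLM NEGOTIATE message")
    names = [name for bit, name in FLAGS.items() if flags & bit]
    return {"length": len(raw), "flags": flags, "flag_names": names}


for label, token in (("packer", PACKER_TOKEN), ("ansible", ANSIBLE_TOKEN)):
    info = parse_negotiate(token)
    print(label, info["length"], hex(info["flags"]), info["flag_names"])
```

The Packer message is only the 16-byte header with flags 0x00008205, while the Ansible message is 40 bytes with flags 0xE2889032. Among the differences, the Ansible flags include NEGOTIATE_EXTENDED_SESSIONSECURITY, NEGOTIATE_SIGN and NEGOTIATE_SEAL and the Packer flags do not.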

TLDR: I don't think the issue lies with Packer, they seem to follow the spec but the underlying WinRM library doesn't support message encryption yet. This is what the users should be doing in order to connect successfully

  • @sg071017 by default Basic auth is not available and Packer defaults to using it; you either need to enable it in your bootstrapping script with Set-Item -Path WSMan:\localhost\Service\Auth\Basic -Value $true or set winrm_use_ntlm to true in your packer.json
  • @SNikalaichyk because you are on a hardened OS, you need to use NTLM over HTTPS

Either of the scenarios described in this issue will result in Windows sending a 401 response, which seems to fit the environments you are both talking about.

@jborean93,
I use NTLM over HTTPS, and am able to successfully connect to the machine using PowerShell Remoting and Ansible, but not from Packer. Please find more details below.

The LocalAccountTokenFilterPolicy registry value is set to 1:

PS C:\> Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System |
>> Select-Object -Property LocalAccountTokenFilterPolicy

LocalAccountTokenFilterPolicy : 1

WinRM configuration:

C:\> winrm get winrm/config/service

Service
    RootSDDL = O:NSG:BAD:P(A;;GA;;;BA)(A;;GR;;;IU)S:P(AU;FA;GA;;;WD)(AU;SA;GXGW;;;WD)
    MaxConcurrentOperations = 4294967295
    MaxConcurrentOperationsPerUser = 1500
    EnumerationTimeoutms = 240000
    MaxConnections = 300
    MaxPacketRetrievalTimeSeconds = 120
    AllowUnencrypted = false [Source="GPO"]
    Auth
        Basic = false [Source="GPO"]
        Kerberos = true
        Negotiate = true
        Certificate = false
        CredSSP = false
        CbtHardeningLevel = Relaxed
    DefaultPorts
        HTTP = 5985
        HTTPS = 5986
    IPv4Filter = *
    IPv6Filter = *
    EnableCompatibilityHttpListener = false
    EnableCompatibilityHttpsListener = false
    CertificateThumbprint
    AllowRemoteAccess = true

C:\> winrm enumerate winrm/config/listener

Listener
    Address = *
    Transport = HTTPS
    Port = 5986
    Hostname = EC2AMAZ-CGNE1NO
    Enabled = true
    URLPrefix = wsman
    CertificateThumbprint = 6FB3D80B606EC9408B228323CDFF11B1B2A41E6D
    ListeningOn = < ... >

Working Ansible configuration:

---
# file: packer/ansible/group_vars/windows.yml

ansible_connection: winrm
ansible_port: 5986
ansible_winrm_server_cert_validation: ignore
ansible_winrm_transport: ntlm

Working PowerShell test script:

$ipAddress = '< IPAddress >'
$password = '< RandomlyGeneratedEC2Password >'

$splatParams = @{
    ComputerName = $ipAddress
    Authentication = 'Negotiate'
    UseSSL = $true
    Port = 5986
    SessionOption = (New-PSSessionOption -SkipCACheck -SkipCNCheck)
    Credential = [PSCredential]::new(
        'Administrator',
        (ConvertTo-SecureString -String $password -AsPlainText -Force)
    )
    ScriptBlock = {$env:COMPUTERNAME}
}

Invoke-Command @splatParams

Packer communicator settings:

< ... >
"communicator": "winrm",
"winrm_username": "Administrator",
"winrm_use_ntlm": true,
"winrm_use_ssl": true,
"winrm_insecure": true,
< ... >

@SNikalaichyk you just had to throw a spanner in the works :) I'm very curious as to why it still doesn't work for you. The only remaining thing I know of is the CbtHardeningLevel being set to Strict. I don't know whether the Packer WinRM lib works with that, but I see in your output you have it as Relaxed, which probably rules that out. What type of certificate is being used for the listener: is it using sha1/sha256, or something else? What would happen if you changed that CBT level to None with Set-Item -Path WSMan:\localhost\service\auth\cbthardeninglevel -Value None?

Also @SwampDragons, I know this has been discussed before (many times) and I really don't want to beat a dead horse, but this is one of the main reasons I wanted to get the ansible-remote provisioner to run commands through the Ansible stack and not through the Packer communicator. The story around WinRM is a lot more mature in Python compared to Go, as it supports things like Kerberos/CredSSP auth as well as message encryption over HTTP for NTLM/Kerb/CredSSP, whereas Go does not. I haven't looked at the details but I would guess that the CBT stuff also isn't supported in the WinRM lib while it works in Python.

The last time I (briefly) looked into this was https://github.com/hashicorp/packer/issues/4904#issuecomment-349832004. At the time this wasn't possible as the necessary information wasn't exposed to the plugin due to the design of Packer. If at any point you do decide to revisit this and choose to expose this information, giving the provisioners a choice, I'm happy to try and add support for the native Ansible connection plugins.

Once again this isn't trying to be negative, I really enjoy what Packer does and can work around these issues by calling Ansible with shell-local.

The "offending" CIS Benchmark settings appear to be:

  • 2.3.11.7 (L1) Ensure 'Network security: LAN Manager authentication level' is set to 'Send NTLMv2 response only. Refuse LM & NTLM' (Scored)
  • 2.3.11.10 (L1) Ensure 'Network security: Minimum session security for NTLM SSP based (including secure RPC) servers' is set to 'Require NTLMv2 session security, Require 128-bit encryption' (Scored)

The ~solution~ workaround was to add the following to the WinRM bootstrap script:

Set-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control\Lsa' -Name 'LmCompatibilityLevel' -Value 2 -Type DWord -Force
Set-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0' -Name 'NTLMMinServerSec' -Value 536870912 -Type DWord -Force

Ah, thanks for the clarification @SNikalaichyk. It seems like the NTLM Go library doesn't fully support NTLMv2 with extended security. I'm pretty sure it was sending NTLMv2 hashes, but potentially it was also sending the LM/NTLMv1 hash as well, causing it to fail. I didn't look too closely at the Authentication message so I can't confirm for sure.

We have this same problem using Packer to build Windows 2016 on both Azure and AWS.

On AWS it's not a problem because we just use an old version of Packer; on Azure we need at least 1.2.1 because that's the first version to support plan_info, so we needed a workaround:

First set the DSC:

Registry SV-88361r1 {
    Ensure    = "Present"
    Key       = "HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0\"
    Force     = $true
    ValueName = "NTLMMinServerSec"
    ValueData = "537395200"
    ValueType = "Dword"
}

Then at the end of the DSC script we temporarily set it to the "working" value:
Set-ItemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0 -Name NTLMMinServerSec -Value 536870912

And in the last and final script (the sysprep one) we set it to the correct one again:
Set-ItemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0 -Name NTLMMinServerSec -Value 537395200

That way at least our compliance check won't fail.

Is there any news on this issue? If needed I'll create a separate issue, but this one looks to be identical.

Still broken in current release.

@jborean93 @SwampDragons Anything I can help with here to get this fixed? I can route this through Microsoft back-channels if a fix for go-ntlmssp is needed

@riverar I'm honestly not sure, as I haven't looked at the go-ntlmssp code, so I can't say where the error is actually occurring. Unfortunately NTLM gets a bit hairy when looking at generating the NT hash for NTLMv2, as there are quite a few permutations around what fields to include/omit in the authentication message sent to the server.

I may have some time to track down what the actual problem is but it all depends on if I can find some free time. As for fixing it, it really depends on what the actual problem is.

Looking at some of the comments since, it seems like NTLMMinServerSec is key to this. This page has some details on what the value means, i.e.

  • (Working) 536870912 == 0x20000000, fails if 128-bit encryption is not negotiated
  • (Fails) 537395200 == 0x20080000, fails if above + NTLMv2 session security is not negotiated
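
The two values can be broken down with a few lines of Python. This is only a sketch: the function name `describe_min_server_sec` is made up, and it covers just the two bits discussed in this thread (0x00080000 for NTLMv2 session security and 0x20000000 for 128-bit encryption), not every bit the setting can carry.

```python
from typing import List

# The two NTLMMinServerSec bits discussed in this thread.
REQUIREMENTS = [
    (0x00080000, "NTLMv2 session security"),
    (0x20000000, "128-bit encryption"),
]


def describe_min_server_sec(value: int) -> List[str]:
    """List which of the known requirements are switched on in `value`."""
    return [label for bit, label in REQUIREMENTS if value & bit]


for value in (536870912, 537395200):
    print(value, hex(value), describe_min_server_sec(value))
```

The only difference between the working and the failing value is the 0x00080000 bit, i.e. the requirement for NTLMv2 session security.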

Cool, I have spare cycles and can take a look as well.

@jborean93 Spent a whole day tracing through tedious protocol back and forth and believe I have some answers:

  • Packer's masterzen/winrm dep does not support GSSAPI nor Kerberos, so cannot communicate with encrypted payloads. All target Windows machines communicating over HTTP must run winrm set winrm/config/service @{AllowUnencrypted="true"}.
  • Packer needs to bring in latest Azure/go-ntlmssp dep to take in a fix (https://github.com/Azure/go-ntlmssp/commit/4a21cbd618b459155f8b8ee7f4491cd54f5efa77) for a bit not getting set during negotiation. This would cause failures on machines requiring NTLMv2 session security (where NtlmMinServerSec is 0x20080000).
  • azure/go-ntlmssp itself is okay. As of latest commit, it communicates NTLMv2 with session security correctly. It generates the correct NTLMv2_RESPONSE structure, hashes, etc. and I can reproduce success with a dev copy of packer + a shiny new Windows 10 RS5 machine.
  • Tinkering around with LmCompatibilityLevel is not necessary or recommended.
  • Tinkering around with NtlmMinServerSec is only necessary for current version of packer (0x0 or 0x20000000 are okay, anything higher is not). This is a bug and should get fixed asap.
  • Tinkering around with LocalAccountTokenFilterPolicy is not normally required if users use winrm quickconfig, otherwise due to UAC, it must be set to 1 to allow connections from non-built-in-Administrator.

(revision: Fixed AllowUnencrypted scenario per @jborean93's guidance)
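
To make the findings above easier to apply, here is a rough Python sketch that maps a setup to the 401 causes identified in this thread. It only summarises the discussion and is not how Packer itself behaves internally. The `Scenario` fields and the function name are hypothetical, and `has_ntlmssp_fix` simply stands for "the Packer build includes the updated go-ntlmssp".

```python
from dataclasses import dataclass
from typing import List

NTLMV2_SESSION_SECURITY = 0x00080000  # bit in NTLMMinServerSec


@dataclass
class Scenario:
    use_ssl: bool             # winrm_use_ssl in the Packer template
    use_ntlm: bool            # winrm_use_ntlm in the Packer template
    basic_auth_enabled: bool  # Service\Auth\Basic on the target
    allow_unencrypted: bool   # Service\AllowUnencrypted on the target
    ntlm_min_server_sec: int  # NTLMMinServerSec registry value on the target
    has_ntlmssp_fix: bool     # hypothetical: build vendors the fixed go-ntlmssp


def likely_401_causes(s: Scenario) -> List[str]:
    """Return the causes from this thread that apply to the given setup."""
    causes = []
    if not s.use_ntlm and not s.basic_auth_enabled:
        causes.append("Basic auth is disabled on the target")
    if not s.use_ssl and not s.allow_unencrypted:
        causes.append("plain HTTP while AllowUnencrypted is false")
    needs_v2 = bool(s.ntlm_min_server_sec & NTLMV2_SESSION_SECURITY)
    if s.use_ntlm and needs_v2 and not s.has_ntlmssp_fix:
        causes.append("target requires NTLMv2 session security")
    return causes


hardened = Scenario(True, True, False, False, 0x20080000, False)
print(likely_401_causes(hardened))
```

An empty list does not guarantee a working connection; it only means none of the causes discussed here apply.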

Awesome work, sounds like https://github.com/Azure/go-ntlmssp/commit/4a21cbd618b459155f8b8ee7f4491cd54f5efa77 is the root cause for this issue in general and the work has already been done.

Just a quick follow up on some of your posts

Packer's masterzen/winrm dep does not support GSSAPI nor Kerberos, so cannot communicate with encrypted payloads. All target Windows machines must winrm set winrm/config/service @{AllowUnencrypted="true"}. This is regardless of HTTP or HTTPS transport.

Yes, it is true that it doesn't support GSSAPI wrapping/unwrapping, but you only need to disable the encryption check when running over HTTP. Running over HTTPS is enough to pass the encryption check, and this SHOULD not be set to true if running over HTTPS. Anyway, message encryption with NTLM is pretty abysmal (128-bit RC4 cipher), so I would recommend using HTTPS anyway.

azure/go-ntlmssp itself is okay. As of latest commit, it communicates NTLMv2 with session security correctly. It generates the correct NTLMv2_RESPONSE structure, hashes, etc. and I can reproduce success with a dev copy of packer + a shiny new Windows 10 RS5 machine.

Agreed, I remember looking at the packets and they seemed like it was using NTLMv2, seems like not having that flag set caused the issue.

Tinkering around with LmCompatibilityLevel is not necessary or recommended.

Agreed, no reason to set this to 2 or below as go-ntlmssp supports NTLMv2.

Tinkering around with LocalAccountTokenFilterPolicy is not normally required if users use winrm quickconfig, otherwise due to UAC, it must be set to 1 to allow connections from non-built-in-Administrator.

A quick side note, winrm quickconfig will actually set this value for you, not negate the need for it. The Enable-PSRemoting cmdlet in PowerShell has a bug where it will not set this value if a listener is already active. This is problematic in Server 2012 R2 and newer as the listener is enabled by default.

Thanks for looking into this again.

Yes, it is true that it doesn't support GSSAPI wrapping/unwrapping, but you only need to disable the encryption check when running over HTTP. Running over HTTPS is enough to pass the encryption check, and this SHOULD not be set to true if running over HTTPS. Anyway, message encryption with NTLM is pretty abysmal (128-bit RC4 cipher), so I would recommend using HTTPS anyway.

Ah, yes, missed the HTTPS scenario, thanks! I'll submit a PR for updating the deps and look at clearing up the docs a bit in a moment.

Okay, thanks @riverar, awesome; I merged it as it seems pretty safe to update the vendors and it worked for me.
To be sure, can anyone tell me if it fixed it for them?

packer_windows_amd64.zip

packer_linux_amd64.zip

mac: packer_darwin_amd64.zip

Closing this one for now, we'll reopen if there is a problem 🙂

@sg071017 @SNikalaichyk @mbrouwer Heads up.

Sorry, had some issues with my build server not related to this.
Anyways, I tried versions 1.2.1, 1.3.1 and 1.3.2 and adding this line :
Set-ItemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0 -Name NTLMMinServerSec -Value 537395200
to the first of two tasks (second was the sysprep step) broke the build using 1.2.1 and 1.3.1 but 1.3.2 kept working, so I can confirm this fixed our issue! Thanks!

sweet, thanks for confirming 🙂

Still not able to get packer/winrm to work with a source CIS hardened windows image (packer ebs builder).

This is an example of the user-data script I am running:

https://gist.github.com/AndrewCi/4ffb3f7094a765b8631623e3b96d3011

This is the error I'm getting while waiting for WinRM to connect:

"[ERROR] connection error: unknown error Post https://xx.xxx.xx.xxx:5986/wsman: dial tcp xx.xxx.xx.xxx:5986: i/o timeout"

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
