Packer: Generation 2 Hyper-V VM boots too fast for boot_command to trigger

Created on 5 Feb 2019 · 64 comments · Source: hashicorp/packer

I'm trying to build a generation 2 Windows Server 2016 VM on Windows 10 with the Hyper-V role installed. I have the exact same issue as @janegilring in the quote below. More info in a minute, I just need to get out of this "Reference new issue" popup, which just closed on me.

"I was just setting up the same Packer build configuration in a different environment (lab - slower hardware).
The issue in that environment seems to be the opposite: While Packer is in the "Starting the virtual machine..." state, the VM has already started and the "Press any key to start installation" screen is gone when Packer gets to the waiting state. Even when setting the boot wait to 0 seconds, Packer is too slow to type the boot commands.
However, I suppose that's another issue so I'll create one after some more testing.

_Originally posted by @janegilring in https://github.com/hashicorp/packer/issues/6208#issuecomment-384878910_"

Labels: builder/hyperv, upstream-bug


All 64 comments

Log output:
==> hyperv-iso: Creating build directory...
==> hyperv-iso: Retrieving ISO
hyperv-iso: Using file in-place: file:///C:/Automation/ISO/Newest2016/windows2016.ISO
==> hyperv-iso: Starting HTTP server on port 8068
==> hyperv-iso: Creating switch 'internal_switch' if required...
==> hyperv-iso: switch 'internal_switch' already exists. Will not delete on cleanup...
==> hyperv-iso: Creating virtual machine...
==> hyperv-iso: Enabling Integration Service...
==> hyperv-iso: Setting boot drive to os dvd drive C:/Automation/ISO/Newest2016/windows2016.ISO ...
==> hyperv-iso: Mounting os dvd drive C:/Automation/ISO/Newest2016/windows2016.ISO ...
==> hyperv-iso: Skipping mounting Integration Services Setup Disk...
==> hyperv-iso: Mounting secondary DVD images...
==> hyperv-iso: Mounting secondary dvd drive ./windows/2016/answer.iso ...
==> hyperv-iso: Configuring vlan...
==> hyperv-iso: Starting the virtual machine...
==> hyperv-iso: Attempting to connect with vmconnect...
==> hyperv-iso: Host IP for the HyperV machine: 192.168.10.103
==> hyperv-iso: Typing the boot command...
==> hyperv-iso: Waiting for WinRM to become available...

When Packer gets to the "Typing the boot command..." part, the VM is already way past the "Press any key to boot from cd or dvd" prompt.

I have tried starting up in headless mode, but the VM still boots too fast. I'm not really sure there is any solution to this other than building an ISO which doesn't prompt me to press a key to start the installation. I have had plenty of success building generation 1 VMs on the same Windows 10 machine, though I never see the prompt there. Below is the template I'm using.

{
  "builders": [
    {
      "boot_wait": "0s",
      "boot_command": [ "aaaaaaa" ],
      "configuration_version": "9.0",
      "vm_name": "windows2016",
      "type": "hyperv-iso",
      "disk_size": 76800,
      "floppy_files": [],
      "secondary_iso_images": [
        "./windows/2016/answer.iso"
      ],
      "headless": false,
      "http_directory": "./windows/common/http/",
      "guest_additions_mode": "disable",
      "iso_url": "../ISO/Newest2016/windows2016.ISO",
      "iso_checksum_type": "none",
      "iso_checksum": "e3779d4b1574bf711b063fe457b3ba63",
      "communicator": "winrm",
      "winrm_username": "vagrant",
      "winrm_password": "vagrant",
      "winrm_timeout": "4h",
      "shutdown_command": "shutdown /s /t 10 /f /d p:4:2 /c \"Packer Shutdown\"",
      "ram_size": 2048,
      "cpu": 1,
      "generation": 2,
      "switch_name": "internal_switch",
      "enable_secure_boot": true
    }
  ],
  "provisioners": [
    {
      "type": "powershell",
      "elevated_user": "vagrant",
      "elevated_password": "vagrant",
      "scripts": [
        "./windows/common/cleanup.ps1"
      ]
    }
  ],
  "post-processors": [
    {
      "type": "vagrant",
      "keep_input_artifact": false,
      "output": "{{.Provider}}_windows-2016.box"
    }
  ]
}

@marcinbojko I know you've done a lot with generation 2 Windows VMs -- do you have any insights for a workaround here? I don't think there's really anything Packer can do, because Gen 2 VMs just blast through the boot sequence so fast.

@SwampDragons - what's funny: having lots of different Hyper-V stacks (different bare-metal hosts and versions) and different DVDs/ISOs to test, I can say one thing: it's unpredictable ;)
Unfortunately, the only workaround I've found is to make a boot loop with:

      "boot_command": [
        "a<enter><wait>a<enter><wait>a<enter><wait>a<enter>"
      ],

@SwampDragons I'd suggest maybe using a feature called 'start delay'; it's better for Packer to wait a second or ten and then just let the Gen 2 VM fly.
[screenshot: Hyper-V 'Automatic Start Delay' setting]

The name of the feature is here:

 get-vm -Name ito-el6-n1.spcph.local|select name,automaticstartdelay

Name                   AutomaticStartDelay
----                   -------------------
ito-el6-n1.spcph.local                   0
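
For reference, the delay can be set and verified with Set-VM; a minimal sketch, assuming the Hyper-V PowerShell module is available (the VM name is just the one from the output above):

```powershell
# Wait 10 seconds after the Hyper-V service starts before auto-starting the VM.
Set-VM -Name ito-el6-n1.spcph.local -AutomaticStartDelay 10

# Verify the new value.
Get-VM -Name ito-el6-n1.spcph.local | Select-Object Name, AutomaticStartDelay
```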

Startup_delay is a great hint! I'll add it to the hyper-v docs.

Hey guys, I really appreciate your suggestions on the issue here. Unfortunately the AutomaticStartDelay setting won't help much, as it doesn't slow down the boot process once the VM gets the initial start trigger.

What AutomaticStartDelay really does is prevent a boot storm when a Hyper-V host, or an entire Hyper-V cluster, running many VMs is rebooted.

Example:
  • VM1 is running on host1 and has AutomaticStartDelay set to 60 seconds.
  • Host1 is rebooted.
  • Because VM1 was running on host1 prior to the reboot, it will automatically start again once the Hyper-V service has started.
  • Hyper-V waits 60 seconds before powering on VM1.
  • After 60 seconds, VM1 powers on and runs through the boot process as fast as possible.

I'll take a look at the boot_command tweak suggested here. My boot_command is currently: "boot_command": [ "a<wait>a<wait>a<wait>a<wait>a<wait>a<wait>a" ],

It doesn't seem to have any effect in the VM, though, as I don't see the VM rebooting multiple times. It could actually work, I guess. I'll grab some screenshots to give you a better understanding of what happens at my end.

Hmm, I tried the following settings, but the VM doesn't seem to get any input from Packer at all:
"boot_wait": "5s",
"boot_command": ["<leftCtrlOn><leftAltOn><endOn><leftCtrlOff><leftAltOff><endOff><wait>a<enter>"],

@KimRechnagel - my initial understanding of your problem was that Packer was too slow to start interacting with the VM's boot menu - in that case AutomaticStartDelay is the key to it.
I don't recall having these issues, even on super-duper fast hosts with SSD storage.
Could you start gathering data? Packer version, which terminal you're using (cmd, powershell, conemu).
Also, let's try changing the ISO, as I recall some of the latest releases (I am using the Partner channel, though) had problems with boot_command - can you download and check the generic Windows 2016 Evaluation ISO?
Last but not least, could you try my templates?
https://github.com/marcinbojko/hv-packer

@SwampDragons - it's not so great, as it has to be set by Packer during VM creation ;) I'd suggest adding this option to Packer (in the code, of course) to be able to slow things down a little for super-fast VMs.

@marcinbojko Yes, Packer is too slow to start interacting with the boot menu, or the VM is too fast for vmconnect.exe (which I can see in the code is what Packer uses) to connect to the VM.

I'm not trying to be rude, but AutomaticStartDelay has nothing to do with this issue; it works exactly as I described above. I tested it locally on my machine by setting AutomaticStartDelay to 10 seconds and then starting the VM. It doesn't delay anything after the start request has been sent to the VM; it just tells the host to wait X seconds before sending the start request when the host has e.g. been rebooted.

I'll test with another ISO and will also collect data about my system, versions etc. as per your suggestion.

Thanks for your feedback.

@KimRechnagel - no worries, the start delay was recommended under our first understanding of your problem - which we've already ruled out.

Hmm maybe the "solution" could be as simple as getting packer to connect to the VM before sending the Start-VM cmdlet.

I just "tested" it manually and what happens is that I connect to the VM and see the black console. When I hit the start button it still takes vmconnect about 3-4 seconds to actually display the boot screen. I see the "Press any key to boot from CD or DVD..." for about 1 second before it times out and tries to PXE boot instead.

I guess the issue might just be that vmconnect.exe is too slow to connect. Well, I'll look into that as well.

@KimRechnagel - what would happen if you switched to an enhanced session (in vmconnect) for this particular Packer VM?

@marcinbojko Enhanced session was already enabled. I disabled it but unfortunately it didn't change anything.

I did test something else, though it raises a lot of other challenges with DHCP/PXE etc. If I change the boot order to be:
Harddrive
Network Adapter
DVD Drive (my install ISO)
DVD Drive (answer.iso with autounattend.xml etc.)

Then the VM waits for PXE to time out, and vmconnect has plenty of time to connect to the VM. The problem is that I then only have a small window to send the boot commands, between the end of the PXE timeout and when the "Press any key to boot..." prompt times out. Furthermore, if I had DHCP/BOOTP on my network, that would complicate the boot process even more.
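
The reordering described above can be scripted with Set-VMFirmware on Gen 2 VMs; a rough sketch, assuming the VM name from the template (windows2016) and that the Hyper-V PowerShell module is available:

```powershell
# Gather the Gen 2 VM's boot devices.
$vm  = Get-VM -Name "windows2016"
$hdd = Get-VMHardDiskDrive -VM $vm
$nic = Get-VMNetworkAdapter -VM $vm
$dvd = Get-VMDvdDrive -VM $vm   # both DVD drives, install ISO listed first

# Put the (still empty) hard drive and the NIC ahead of the DVD drives,
# so the PXE timeout buys time before the firmware reaches the install ISO.
Set-VMFirmware -VM $vm -BootOrder (@($hdd) + @($nic) + @($dvd))
```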

A question regarding boot_command on Hyper-V: the documentation states that I can add "On" to e.g. <leftCtrl> in order for Packer to hold down the key, which would allow me to send Ctrl+Alt+End (reboot), but it doesn't seem to work. Maybe because the scancodes haven't been implemented the same way on Hyper-V as on e.g. VirtualBox, VMware etc.?

I tried with
"boot_command": ["<leftCtrlOn><leftAltOn><endOn><leftCtrlOff><leftAltOff><endOff><wait>a<enter>"],
But it didn't do anything. Well, maybe my issue is that the boot_command isn't sent at all :-)

I never saw the "Press any key to boot..." when creating Gen1 VMs, so I don't actually know if boot_command works on my setup.

Still waiting for the eval ISO to download.

My current settings:
[screenshot: VM settings]

How far into the installation is this? Did it just start?
I don't see the bootmgfw.efi in my settings.

That's interesting - my Packer build just went through its 3rd batch of Windows Updates.
As far as I know, a Gen 2 machine (on 2016/2019) should have this file.

I tested the templates you linked from your github repo. I used the ISO which I have downloaded from the VLSC site. Same issue. I'll test again with the eval ISO in about 20 minutes when it has finished downloading.

My settings with your template:
[screenshot: VM settings with marcinbojko's template]

It seems like your Hyper-V host is physical, or at least running Server 2016? I'm testing on my laptop with the latest version of Windows 10. It might make a difference when building Gen 2 machines.

True. I am not a Windows guy, but I'll try with W10.

OK, the evaluation ISO finished downloading. I didn't change anything but the ISO, I used your templates... and it works. It's very odd; it seems like the eval ISO waits just about 1-2 seconds longer at the "Press any key to boot" prompt, which means that Packer has time to connect and send the boot_command.

Yup, that's what I noticed in the thread you mentioned. Switching to a different ISO (Partner channel) broke my deployment flow. Blame Microsoft?

It does not work with the template I modified myself. I've tested twice now and the boot_command does not seem to be sent. I'll tweak the settings one line at a time until I figure out what triggers this.

Wow this is weird. I managed to "break" your template as well by changing: "iso_url": ".\\iso\\Windows_Server_2016_Datacenter_EVAL_en-us_14393_refresh.ISO",

To: "iso_url": "../ISO/Newest2016/Windows_Server_2016_Datacenter_EVAL_en-us_14393_refresh.ISO",

Changed it back, and it worked again

Well, now your template fails again. It consistently failed three times in a row. This is odd; there is a very, very fine balance between when it works and when it doesn't.
I'll keep testing.

I just tested with W10 1803 (I don't have 1809, as it fails to upgrade).
With the Partner ISO I have no way of even booting: "Press any key" displays for 1 second and it's gone.
@SwampDragons - I am sorry to say it, but this is related to the previous issue - Packer has no chance to react that fast in the current setup. As we probably cannot rely on Microsoft to rebuild all the ISO images, we probably need better control over how fast vmconnect reacts.

With debug and headless: false:

==> hyperv-iso: Configuring vlan...
==> hyperv-iso: Starting the virtual machine...
==> hyperv-iso: Attempting to connect with vmconnect...
==> hyperv-iso: Host IP for the HyperV machine: 169.254.2.24
==> hyperv-iso: Typing the boot command...
2019/02/06 11:30:05 packer.exe: 2019/02/06 11:30:05 Sending char 'a', code '1e9e', shift false
2019/02/06 11:30:05 packer.exe: 2019/02/06 11:30:05 Sending char 'a', code '1e9e', shift false
2019/02/06 11:30:05 packer.exe: 2019/02/06 11:30:05 Special code 'Press' '' found, replacing with: &{[1c] [9c]}
2019/02/06 11:30:05 packer.exe: 2019/02/06 11:30:05 Sending char 'a', code '1e9e', shift false
2019/02/06 11:30:05 packer.exe: 2019/02/06 11:30:05 Special code 'Press' '' found, replacing with: &{[1c] [9c]}
2019/02/06 11:30:05 packer.exe: 2019/02/06 11:30:05 Sending char 'a', code '1e9e', shift false
2019/02/06 11:30:05 packer.exe: 2019/02/06 11:30:05 Special code 'Press' '' found, replacing with: &{[1c] [9c]}
2019/02/06 11:30:11 packer.exe: 2019/02/06 11:30:11 [DEBUG] Unable to get address during connection step: No ip address.
2019/02/06 11:30:11 packer.exe: 2019/02/06 11:30:11 Waiting for WinRM, up to timeout: 8h0m0s

Before vmconnect displays, the prompt is long gone, and I can only get a PXE boot menu.

I did some tests with your template, @marcinbojko; here are the results as well as the changes I made during testing. I guess the conclusion is: do not build Gen 2 VMs on Hyper-V using super-fast storage.

Attempt #

  1. Failed
  2. Failed
  3. Failed
  4. Change: Set vmcompute.exe priority to low - worked once
  5. Failed
  6. Failed
  7. Change: Set vmcompute.exe affinity to CPU0 - worked once
  8. Failed
  9. Failed
  10. Failed
  11. Failed
  12. Change: Started five sequential checksum checks of the ISO with: fciv .\Windows_Server_2016_Datacenter_EVAL_en-us_14393_refresh.ISO - worked once
  13. Worked
  14. Change: Made an infinite loop checksum check .cmd file - Failed
  15. Failed
  16. Worked
  17. Failed
  18. Change: Started four parallel infinite checksum checks (poor SSD - CPU maxed out) - The VM took forever to start - Cancelled one checksum and the VM started - packer boot_command had timed out - Failed
  19. Change: Three parallel infinite checksum checks - Worked, I did see a boot manager shortly before the install started
  20. Change: Two parallel infinite checksum checks - Worked
  21. Worked
  22. Worked
  23. Worked
  24. Worked
  25. Worked
  26. Worked
  27. Tested my original template - still with two infinite checksum checks running - Worked
  28. Worked
  29. Worked

Tested with the EVAL ISO (in form of https://link) - works 100% of the time on SSD - the boot manager from this ISO waits LONGER.

All of the above tests were made with the evaluation ISO except for attempts 27-29.
Evaluation ISO: https://www.microsoft.com/en-us/evalcenter/evaluate-windows-server-2016

Hmmm... now it behaves even more strangely (both ISOs).
I can see the PXE menu for 10-15 seconds. Packer still waits for WinRM (it has already passed the boot-menu keystroke stage).
Then, out of the blue, when PXE fails, the VM reboots to the ISO DVD and continues the deployment (successfully).

Did you change the boot order? It sounds like your NIC is above the install ISO.

Nope. DVD (ISO) still first, then hard drive, network adapter, secondary ISO.
BUT I've enabled Packer debug logging.
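
For what it's worth, the effective firmware boot order of a Gen 2 VM can be inspected from PowerShell (a sketch; substitute your own VM name):

```powershell
# List the firmware boot entries in the order the VM will try them.
(Get-VMFirmware -VMName "windows2016").BootOrder |
    Select-Object BootType, Device
```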

The same with PACKER_LOG=0. It takes approx. 1 minute for PXE to time out, then it just continues.

I wonder if anyone else has this problem. I started using packer last week as I'm going to build multiple base templates using packer for hyper-v and soon VMWare.

My next step is to get Ansible to interact with our hyper-v clusters and build VMs using the packer templates. I'm currently testing everything locally but will eventually move everything to a dedicated server. I guess when I move to a virtual server with hyper-v installed, then I probably won't run into this issue anymore.

When my build fails I see this for about 1 minute:
[screenshot: the screen shown during the failed boot]

Then this - it stays here until I stop the VM:
[screenshot: the screen where the VM stays until stopped]

W10 1803 here, but I can test it into hv2019 cluster.

1809 here. I'll build a dedicated packer server and start testing on that instead.

Yup, we'll compare notes.

@marcinbojko Thank you very very much for your help and feedback on this issue, I really appreciate it. I guess this "issue" is out of the hands of the packer developers as it's not really a packer coding issue.

You're very welcome. I've started to test on 2019 and soon we'll know more.

@KimRechnagel, @SwampDragons - I'd like to confirm what's been said: on W2019 (me) and W10 (Kim), Packer is unable to boot from DVD if run on quite fast storage. In my case it was S2D built completely on SSD. I've used the Packer 1.3.5 build from the "spaces in switch name" issue, as my switch has spaces in its name.

This is very frustrating, but I don't think there's anything Packer can do about it; googling shows that this "Windows moves through the boot screen too fast on Gen 2 VMs" issue exists for people who aren't using Packer, too. I'm going to mark this as an upstream bug and close it, but if anyone has any good ideas for reliable workarounds, I'd love to add them to the documentation.

Thanks for all your help @marcinbojko.

@SwampDragons sorry for answering on a closed issue - I'd like to try the approach of passing -AutomaticStartDelay to New-VM or Set-VM. So the sequence would be: run vmconnect and WAIT for the VM to start.
The problem is I have absolutely no clue about Golang. If it's not too much, can you point me to the piece of code that builds or sets the 'New-VM' or 'Set-VM' part?

Ah, sorry; I didn't realize you were thinking of adding this option. The PowerShell scripts that comprise the Hyper-V driver are here, and the New-VM code specifically is here.

The New-VM code uses Go templating to produce a minimal PowerShell script, which lets us avoid passing a ton of parameters into our PowerShell call.

@marcinbojko I tested your template on a standalone physical Dell PowerEdge 815 Hyper-V 2012 R2 host with local hard drives. The funny thing is that I see the same behavior as you. The VM starts, I see the "Press any key" prompt for maybe 3-4 seconds (Packer seems to be connected here), then the VM goes into PXE boot, times out after 60 seconds, goes back to "Press any key" and THEN starts the installation.

2012/2016 and Windows 10 up to 1803 work. W10 1809/2019 = Packer unusable.

@marcinbojko Just an update. I have built a nested Hyper-V host on a Hyper-V 2016 cluster. I have used my original 1809 ISO from the VLSC site as well as the evaluation ISO, and so far I have not had any issues with Packer connecting too slowly. It seems like vmconnect.exe connects much faster in my current setup, so missing the boot_command is not an issue.

Interesting. DNS issues?

I don't think so; it wouldn't make sense for vmconnect to rely on DNS to look up local VMs.

I suppose your nested HV host is standalone, outside of AD?

Yes, it's a standalone Hyper-V host. Its only purpose is building Packer templates, which I'm going to use from an Ansible server elsewhere in our infrastructure. The Packer Hyper-V host is a member of an AD domain and uses the AD DNS servers. The Hyper-V host is also a DHCP server exclusively for the Packer templates.

"I have tried to start up in headless mode but the VM still starts too fast."

I _wish_ I had this bug. It takes 48 hours for all of my Hyper-V images to build. Sometimes longer, if any of the builds gets stuck at the "Waiting for SSH" stage.

As for a solution, I have a couple of ideas you could try (I obviously don't have the hardware to do so myself). Namely, try adding PXE (aka the network boot option) to your boot order. That might buy you the seconds you need. You could modify this chunk of Packer code to include the network boot option - just update the PowerShell command Packer is using.

A more reliable kludge might be booting the machine without an ISO (or with a non-bootable dummy ISO, if one is needed to ensure a DVD drive is provisioned), then waiting for the new guest to reach the UEFI error screen (shown below). From there you can mount the actual installation ISO and, with a sufficient delay, have Packer start the boot command/install process by first triggering a reboot (aka pressing any key) onto the freshly mounted ISO.

If the latter idea works (i.e. if you test it by mounting/swapping the ISO manually), then we know a potential fix, and the focus could shift to making Packer use this strategy.

[screenshot: Gen 2 UEFI boot-failure screen]

I've been encountering this same issue and wanted to detail something that appears to be working for me.

First off my environment:

  • Host: windows 10 version 1809
  • HW: pcie SSD
  • Packer v1.3.5
  • Powershell 5.1
  • OS: Windows Server 2019 Eval (from the Microsoft website)
  • Generation 2, verified DVD as first boot option
  • NOT headless. vmconnect running

Issue:
As others have mentioned, when the VM is first booted it runs through the boot options before Packer ever connects, leaving you stuck at this screen:

[screenshot: frozen boot screen]

Note: this screen is actually frozen. It took me a while to realize this, but if you've made it this far, Packer can NO longer send boot commands. I kept trying to send Ctrl+Alt+Del / Ctrl+Shift+Alt+Del to try to reboot the machine, and nothing was happening.

Solution

The solution for me, which feels very much like a hack, was to wait until the frozen boot screen times out (about 60s). After that, this error screen appears:

[screenshot: UEFI error screen after the timeout]

The good news is, once you get to the error screen, Packer can start sending boot commands again, so from here you just need to tab down and press Enter to restart.

Here's the code:

            "boot_wait": "70s",
            "boot_command": [
                "<tab><wait><enter><wait>",
                "a<wait>a<wait>a<wait>a<wait>a<wait>a<wait>"
            ],

Why not create an image with the efisys_noprompt.bin file and skip this 'press any key' prompt entirely? It works for all hypervisors. Don't forget to generate a new hash (Get-FileHash).
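
For anyone trying this route: such an image can be repacked with oscdimg from the Windows ADK, roughly as below. The paths are placeholders; the key part is pointing the EFI boot entry at efisys_noprompt.bin instead of efisys.bin.

```bat
:: Repack the extracted ISO contents (here C:\iso-root) into a new image whose
:: EFI boot image (efisys_noprompt.bin) skips the "Press any key" prompt.
oscdimg -m -o -u2 -udfver102 ^
  -bootdata:2#p0,e,bC:\iso-root\boot\etfsboot.com#pEF,e,bC:\iso-root\efi\microsoft\boot\efisys_noprompt.bin ^
  C:\iso-root C:\Automation\ISO\windows2016_noprompt.iso
```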

We've got a PR containing a potential fix for this, if anyone is up for testing it out:

https://circleci.com/gh/hashicorp/packer/9097#artifacts/containers/0

@SwampDragons - this worked for me.

@SwampDragons Thank you, this works for us also!
Do you know when 1.4.4 will be released?

Next week, probably Tuesday. :)

@SwampDragons, thank you for your hard work! Packer 1.4.4 has been released and can be found on the https://packer.io/downloads.html page.
Will it be released at https://github.com/hashicorp/packer/releases?
Will the Docker images be updated on hub.docker.com?

Ah, I will fix that link. However, 1.4.4 had a critical HyperV bug that we didn't catch until just after releasing -- you should probably use the nightly build that is already linked there.

Oh, I think it was my doing. I was fixing that broken indentation and it slipped in somehow.

@marcinbojko It's my fault for not catching it in tests; it's already fixed on the master branch, so no worries.

I'm going to lock this issue because it has been closed for _30 days_. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

