Packer: post-processor vsphere-template add parameter for wait for ‘VMware vSphere to start’ bug/feature-request

Created on 27 Feb 2018  ·  30 Comments  ·  Source: hashicorp/packer

Packer version: v1.2.1
Host platform: Packer running on Windows 10, ESXi/vCenter 6.5 and ESXi 6.0/vCenter 6.5.

This is a bug report that could be worked around with a simple new feature. The problem appeared in Terraform when trying to clone a VM from a template that Packer had successfully created using the "vmware-iso" builder on a remote vSphere hypervisor and the "vsphere-template" post-processor. See sample Packer template.

Terraform was giving this error:

  • vsphere_virtual_machine.vm: Resource 'data.vsphere_virtual_machine.template' not found for variable 'data.vsphere_virtual_machine.template.id'

This problem always happened on a test ESXi/vCenter 6.5 running on slower hardware, but only intermittently on faster hardware running ESXi 6.0/vCenter 6.5. After some investigation, the root cause was identified as two entries, uuid.bios and vc.uuid, that were missing from the .vmtx file. Backtracking to the Packer run revealed that sometime between the end of the “vmware-iso” builder and the start of the “vsphere-template” post-processor, those entries disappear from the .vmx file and reappear with different values some seconds later (about 10 seconds on the faster hardware and about 15 seconds on the slower hardware).

Looking at the vsphere-template post-processor code, I noticed a 10-second delay when the “vsphere-template” post-processor starts. See https://github.com/hashicorp/packer/blob/ace5fb7622ed46b63831d43ecd6d05b58544cf25/post-processor/vsphere-template/post-processor.go#L105

The comments explaining the reason for this delay have slightly changed over time, but have always been vague about the reason for the delay:

Before Jul 10 2017, https://github.com/hashicorp/packer/commit/3cc9f204acc289e9adbf70c3be087b5c2dd25b8a#diff-2d1af112f5b55ed31686536a6d1b4ac1

           //We give a vSphere-ESXI 10s to sync

Jul 18 2017, https://github.com/hashicorp/packer/commit/fa10616f57f1801713a70793cb2596967b6bbb32#diff-2d1af112f5b55ed31686536a6d1b4ac1:

           // In some occasions when the VM is mark as template it loses its configuration if it's done immediately
           // after the ESXi creates it. If vSphere is given a few seconds this behavior doesn't reappear.

Aug 14 2017, https://github.com/hashicorp/packer/commit/81272d1427b5ce0c30fb79d55a1f7618921a8ad4#diff-2d1af112f5b55ed31686536a6d1b4ac1:

           // In some occasions the VM state is powered on and if we immediately try to mark as template 
           // (after the ESXi creates it) it will fail. If vSphere is given a few seconds this behavior doesn't reappear.

I still don’t know what triggers the removal and addition of those uuids, but it seems clear that the reason for the delay is not fully understood. Turning the delay into a parameter could give a workaround for this issue and possibly future issues due to ESXi/vCenter doing things outside of Packer's knowledge.

Thanks,
Georges

post-processor/vsphere-template upstream-bug

All 30 comments

I am having the same error. I tried with both the vsphere and the vsphere-template post-processors and get the following output:

==> vmware-iso: Connected to WinRM!
==> vmware-iso: Restarting Machine
==> vmware-iso: Waiting for machine to restart...
vmware-iso: A system shutdown is in progress.(1115)
vmware-iso: PKR-WN2K12R2 restarted.
==> vmware-iso: Machine successfully restarted, moving on
==> vmware-iso: Uploading D:\packer2\vmware\windows\dependencies\puppet-agent-1
10.1-x64.msi => C:\Windows\Temp\puppet-agent-install.msi
==> vmware-iso: Restarting Machine
==> vmware-iso: Waiting for machine to restart...
vmware-iso: Reboot complete
vmware-iso: PKR-WN2K12R2 restarted.
==> vmware-iso: Machine successfully restarted, moving on
==> vmware-iso: Gracefully halting virtual machine...
vmware-iso: Waiting for VMware to clean up after itself...
==> vmware-iso: Deleting unnecessary VMware files...
vmware-iso: Deleting: output\vmware.log
==> vmware-iso: Cleaning VMX prior to finishing up...
vmware-iso: Unmounting floppy from VMX...
vmware-iso: Detaching ISO from CD-ROM device...
vmware-iso: Disabling VNC server...
==> vmware-iso: Skipping export of virtual machine (export is allowed only for ESXi and the format needs to be specified)...
==> vmware-iso: Running post-processor: vsphere
vmware-iso (vsphere): Uploading output\windows-base-2012.vmx to vSphere

Am I missing something?

I don't see an error message in the output. Can you explain what the error is?

I don't have a vSphere setup to test on, so this is a tough one for me to do on my own. However, I've made a change to the post-processor so that it uses a retry mechanism rather than depending on a hardcoded wait time. Mind testing it out for me? I've built PR https://github.com/hashicorp/packer/pull/5981 for Windows and attached it here: packer.zip

@GMZwinge when you get a chance, can you test out the provided binary and let me know if that solves your issue? The wait is not configurable, but it should retry until it succeeds rather than failing based on a race condition.

@SwampDragons Sorry for the delay. I'll be trying it shortly.

@SwampDragons Unfortunately, it did not help. It seems that the uuids are removed from the .vmx file as soon as the VM powers off, and possibly even before the post-processor vsphere-template has time to start.

I forgot to mention that removing the template from vCenter, adding uuid entries to the .vmtx file, and reregistering the template with vCenter gets rid of the error in Terraform.

Hmm, okay. Then I think we may be solving the wrong problem. Can I have your debug logs (logs from a Packer run with the env var PACKER_LOG=1)?

Nevermind, I see. We need to check for the uuid before exiting the retry loop.

My original fix was barking up the wrong tree; that added wait ran before Packer creates the client, but the issue doesn't appear to have anything to do with the client. It looks like the only place the vmx file is used is in RegisterVM (https://github.com/hashicorp/packer/blob/master/post-processor/vsphere-template/step_mark_as_template.go#L80) -- the dsPath.String() points to the datastore path of the vmx. @GMZwinge Is that the vmx in which you can see the uuids disappear and then reappear? If so, I think the right answer is to have a loop where we check for the existence of the UUIDs in this vmx file before we run the RegisterVM command.

@SwampDragons Your original fix at least confirmed my finding that the uuids disappear before RegisterVM registers it as a template. From looking at the code, the vmx is indeed the one in which the uuids disappear and reappear. Your check for the existence of the uuids before the RegisterVM command should work. But I would restore the original code "Waiting 10s for VMware vSphere to start". Too much is unknown about why that delay was added to remove it safely. Thanks for coding and building those changes for me to test.

packer.zip
Okay, can you set $env:PACKER_LOG=1 and then run the attached binary, and share logs? I've tried wrapping the templating code in a retryable function.

@SwampDragons Thanks. It didn't seem to make any difference. The log only shows an additional entry saying "Registering VM, attempt 1", but no other attempt or wait, and the .vmtx still does not contain the two entries "uuid.bios" and "vc.uuid". I forgot to say that the .vmx/.vmtx file does have this entry: uuid.action = "create", in case the retryable function only looks for "uuid" in the .vmx file instead of "uuid.bios" and "vc.uuid".
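To illustrate the point about uuid.action: a check that searches for the substring "uuid" would be satisfied by that entry alone, so the keys need to be matched exactly. A minimal Go sketch, assuming the .vmx content is already available as a string of `key = "value"` lines; `parseVMX` and `missingUUIDs` are hypothetical names, not Packer functions.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseVMX reads `key = "value"` lines into a map, skipping lines
// without an "=" sign.
func parseVMX(content string) map[string]string {
	entries := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		key, value, found := strings.Cut(scanner.Text(), "=")
		if !found {
			continue
		}
		entries[strings.TrimSpace(key)] = strings.Trim(strings.TrimSpace(value), `"`)
	}
	return entries
}

// missingUUIDs returns which of uuid.bios and vc.uuid are absent or
// empty. Keys are compared exactly, so uuid.action does not count.
func missingUUIDs(content string) []string {
	entries := parseVMX(content)
	var missing []string
	for _, key := range []string{"uuid.bios", "vc.uuid"} {
		if entries[key] == "" {
			missing = append(missing, key)
		}
	}
	return missing
}

func main() {
	vmx := "displayName = \"demo\"\nuuid.action = \"create\"\n"
	fmt.Println(missingUUIDs(vmx)) // prints "[uuid.bios vc.uuid]"
}
```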

Interesting... I'd assumed the retryable function would fail a couple of times with your original error message vsphere_virtual_machine.vm: Resource 'data.vsphere_virtual_machine.template' not found for variable 'data.vsphere_virtual_machine.template.id'. I'll have to add something specifically looking for those values.

@SwampDragons Yes. There has to be code looking specifically for those uuids in the .vmx file. I'll gladly give feedback on the code changes. Even though I have no experience programming in Go, it seems relatively easy to read.

I probably did not make something clear enough: the original error message vsphere_virtual_machine.vm: Resource 'data.vsphere_virtual_machine.template' not found for variable 'data.vsphere_virtual_machine.template.id' is seen in Terraform, not in Packer. Packer does not give any error at all, but the uuids are missing from the template (.vmtx file). The error message appears when Terraform clones a VM from a template created this way by Packer, or from a template where the above two uuids were removed from the .vmtx file.

That definitely makes more sense. I'll upload a binary once I've had a chance to try writing a check.

I am experiencing the same issue when having Terraform launch VMs from Packer-made templates, and I have a little workaround that may also work for you.
Just add uuid.bios and vc.uuid with vmx_data_post:

"vmx_data_post": {
        "uuid.bios": "{{ user `uuid_bios` }}",
        "vc.uuid": "{{ user `vc_uuid` }}",
        "uuid.action": "create"
      }

@dannietjoh What values did you give for uuid_bios and vc_uuid?

@dannietjoh confirmed this worked for me :) Thank you!

@GMZwinge uuid_bios is the one that actually gets used as the template UUID and needs to be unique. You can use {{uuid}} to have Packer generate one for you (woot!). I hard-coded vc_uuid with the vc.uuid value from a working template's vmtx file. You might be able to use {{uuid}} for that too. If you don't have a template with this value set, you can use the "clone template from template" functionality in the vCenter UI to create one and it'll be populated correctly.

Has there been any commit since 26 Apr that adds the uuids to the vmx file?
If not, why doesn't Packer want to set the uuids?

We want to; I just haven't had a chance to work on it further, yet.

@GMZwinge Did @dannietjoh's fix work for you? I'm considering just generating those values and adding them to the custom data in the postprocessor.

Based on this post, it looks like both vc.uuid and uuid.bios can be unique and generated by Packer.
http://www.virtu-al.net/2015/12/04/a-quick-reference-of-vsphere-ids/

I've created PR #6650 to test whether a longer sleep actually works (the original suggestion), which will inform my next steps. @GMZwinge can you try this out?

windows build:
packer.zip

Based on this comment, I'm wondering if this is even a problem anymore: https://github.com/terraform-providers/terraform-provider-vsphere/issues/558#issuecomment-399917503

@SwampDragons The uuids are removed when the VM shuts down and added back about 10-15 seconds later, depending on the speed of the hardware or possibly vCenter. On fast hardware, this is almost the same as the hard-coded delay logged as "Waiting 10s for VMware vSphere to start". So the template sometimes has the uuids and sometimes does not. But what @dannietjoh proposed using {{uuid}} also worked for me. Not sure what the consequences are from the ESXi/vCenter perspective, though. I'll try the packer.zip as soon as I have the time.

I never heard back about that binary; should I close this issue?

@SwampDragons Sorry, I totally forgot about this. I'll try to test it today.

Sorry for the time it took. The attached packer.zip from Aug 31, 2018 does indeed avoid the issue. The uuids are removed then added back before the longer sleep ends.

Looking at terraform-providers/terraform-provider-vsphere#558 (comment), I can tell that we are using older VMware Tools than the version in that comment (10.2.5) that apparently fixes the issue: version 9.10.5, provided by our vCenter, for Windows, and version 10.1.5-3, provided by the CentOS 7.5 CD, for CentOS. But I don't have time to test with the latest VMware Tools.

So given that we have a workaround by adding missing uuids, and a possible fix by using the latest VMware Tools, I would be OK with closing this issue.

Okay, sounds good. Thanks

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
