We are running Packer as a Docker container from the hashicorp/packer image. Packer is provisioning an Azure VM with Windows OS. The container with Packer is run as an Azure Container Instance (ACI). Both the container and the VM are in the same region. We are getting a lot of azure-arm: unexpected EOF errors with the stack trace as in the gist. I'd say the success rate of reaching the end of a Packer build is 1 in 5.
Packer version: 1.4.3 and 1.4.4.
Dockerfile for building the container with Packer: https://github.com/appveyor/build-images/blob/master/Dockerfile
Packer build file: https://github.com/appveyor/build-images/blob/master/vs2019.json
Packer is run as a container from hashicorp/packer:1.4.3 and hashicorp/packer:1.4.4. The container is run on Azure Container Instances.
https://gist.github.com/FeodorFitsner/1d9d834c6ee4dd9db428ab3e7f8eb1ff
Thanks for opening. I have a couple of follow up questions for you:
Is this happening exclusively with the Azure builder? I see in your template you have google and amazon builders as well.
Is this regularly happening in a specific place in your build (i.e. a specific provisioner, provisioner type, or script?) or does it seem to be random where it happens during the build run?
Does it seem to happen when uploading things of a specific size?
Does it seem to happen after a certain amount of time has elapsed during the build?
We are running that build mostly on Azure and Hyper-V now - I haven't tried it yet on AWS and GCE. However, on Hyper-V it's not happening so often.
That's interesting that it usually fails in these two places:
More often:
==> azure-arm: Provisioning with powershell script: /build-images/scripts/Windows/install_powershell_get.ps1
Build 'azure-arm' errored: unexpected EOF
Less often:
==> azure-arm: Provisioning with powershell script: /build-images/scripts/Windows/install_wsl.ps1
Build 'azure-arm' errored: unexpected EOF
Link to the build script: https://github.com/appveyor/build-images/blob/master/vs2019.json#L88
And yes, both scripts come at the beginning of the build. Usually, if it passes that point, there is a 95% chance the build will be successful (the entire build runs for 5-6 hours).
Not sure about the uploading part of the question. Could you please elaborate?
Thanks in advance for giving me any clue what might be wrong and how to make it more stable.
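To check whether the failures really cluster around the same provisioner scripts, a saved Packer log can be tallied with a small script. This is only a sketch: it assumes the log lines look like the excerpts quoted in this thread, tolerates color codes if they are present, and the function name failing_steps is made up for illustration.

```python
import re
from collections import Counter

# Matches ANSI color codes, with or without the leading ESC byte.
ANSI_RE = re.compile(r"(?:\x1b)?\[[0-9;]*m")
# Line format assumed from the log excerpts quoted in this thread.
STEP_RE = re.compile(
    r"==> (?P<builder>[\w-]+): Provisioning with powershell script: (?P<script>\S+)"
)
ERROR_RE = re.compile(r"Build '(?P<builder>[\w-]+)' errored: (?P<reason>.+)")


def failing_steps(log_text):
    """Count (script, reason) pairs, pairing each build error with the
    last provisioner script announced for the same builder."""
    last_script = {}
    failures = Counter()
    for raw in log_text.splitlines():
        line = ANSI_RE.sub("", raw).strip()
        step = STEP_RE.search(line)
        if step:
            last_script[step["builder"]] = step["script"]
            continue
        err = ERROR_RE.search(line)
        if err:
            script = last_script.get(err["builder"], "<unknown>")
            failures[(script, err["reason"])] += 1
    return failures
```

Running it over the concatenated logs of several failed builds and printing failures.most_common() would show which script most often precedes the error.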
Just ran and got the same place again:
==> azure-arm: Provisioning with powershell script: /build-images/scripts/Windows/install_powershell_get.ps1
Build 'azure-arm' errored: unexpected EOF
==> Some builds didn't complete successfully and had errors:
--> azure-arm: unexpected EOF
==> Builds finished but no artifacts were created.
panic: runtime error: invalid memory address or nil pointer dereference
2019/11/08 20:28:34 packer: [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x6ab3e6]
2019/11/08 20:28:34 packer:
In both cases you shared, the crash is happening while we're using WinRM to upload your powershell scripts to your vm.
I was just wondering if this is some kind of size buffer situation where this is happening with larger files, but at a whopping 289 bytes I guess that's not the issue 😂
This looks like it's happening within the WinRM library we use, and is probably a duplicate of #7350 and possibly #8229. I've never been able to reproduce these intermittent errors from my own setup. I'll try from the packer docker container and see if I can do that... maybe there's something going on with resource constraints inside a container.
Resource constraints could be the reason. Right now it's a container with 2 CPU cores and 4 GB of RAM; however, I remember there were more crashes with 2 GB of RAM. I'm going to try a different container size.
That sure feels like it ought to be enough, but worth verifying.
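One way to verify what the container actually gets is to read the memory limit from the cgroup filesystem from inside the container. The sketch below assumes the standard Linux cgroup v2 and v1 file locations; whether Azure Container Instances exposes them in this form is an assumption, and the function names are illustrative.

```python
from pathlib import Path

# Standard locations of the memory limit for cgroup v2 and cgroup v1.
CGROUP_FILES = (
    Path("/sys/fs/cgroup/memory.max"),
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
)


def parse_limit(raw):
    """Return the limit in bytes, or None when cgroup v2 reports "max"
    (meaning no limit is set)."""
    value = raw.strip()
    if value == "max":
        return None
    return int(value)


def container_memory_limit():
    """Return the first readable memory limit in bytes, or None if no
    limit file is readable or no limit is set."""
    for path in CGROUP_FILES:
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue
    return None
```

Note that cgroup v1 reports a very large number rather than "max" when no limit is set, so a value far above the host's RAM should be read as unlimited.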
I don't have much new to add yet, but I think these four issues are all duplicates of each other:
@FeodorFitsner I think I've tracked down this bug. The patched build here should solve it https://circleci.com/gh/hashicorp/packer/21445#artifacts/containers/0; this comes from the PR linked above.
Fantastic, I'm going to give it a try! Thank you for not giving up on this issue!
While trying the artifact from the build I noticed that the packer_windows_amd64.zip archive contains a pkg\packer_windows_amd64 file inside, which I assumed should be renamed to packer.exe. Also, the resulting executable is about 40 MB smaller than packer.exe from the official distro and it's slower to start. That's expected behavior for these build artifacts, right?
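For anyone else trying the artifact, the extract-and-rename step described above can be scripted. This is a minimal sketch assuming the archive layout mentioned in this comment (a single packer_windows_amd64 entry under pkg); the function name and its parameters are hypothetical.

```python
import shutil
import zipfile
from pathlib import Path


def extract_packer(zip_path, dest_dir,
                   member_name="packer_windows_amd64",
                   exe_name="packer.exe"):
    """Extract the single archive entry whose base name is member_name
    and write it to dest_dir under exe_name. Returns the output path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Entry names may use either slash style depending on how the
        # archive was built, so normalize before comparing.
        matches = [
            name for name in archive.namelist()
            if name.replace("\\", "/").split("/")[-1] == member_name
        ]
        if len(matches) != 1:
            raise FileNotFoundError(
                f"expected exactly one {member_name!r} entry, found {matches}"
            )
        target = dest / exe_name
        with archive.open(matches[0]) as src, open(target, "wb") as out:
            shutil.copyfileobj(src, out)
    return target
```

For example, extract_packer("packer_windows_amd64.zip", "bin") would leave the binary at bin/packer.exe.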
I'm not sure why it would be slower to start, but the size difference doesn't surprise me.
I've just tested the patched packer and it worked like a charm! It was run from a container with 1 core and 2 GB of memory while provisioning a Windows instance on Azure. I ran 3 jobs in a row, 5 hours each, and they all finished successfully! Thanks for fixing that!