I'm going to start off by saying this is a sporadic issue. I will go days with this happening / not happening. I'm trying _NOT_ to use the ssh-agent
When my machine tries to do a simple ssh connection with packer shell provisioner I get the following error on repeat until timeout
2018/01/29 15:26:37 packer: 2018/01/29 15:26:37 [INFO] Attempting SSH connection...
2018/01/29 15:26:37 packer: 2018/01/29 15:26:37 reconnecting to TCP connection for SSH
2018/01/29 15:26:38 packer: 2018/01/29 15:26:38 handshaking with SSH
2018/01/29 15:26:39 packer: 2018/01/29 15:26:39 handshake complete!
2018/01/29 15:26:39 packer: 2018/01/29 15:26:39 opening new ssh session
2018/01/29 15:27:25 packer: 2018/01/29 15:27:25 ssh session open error: 'ssh: unexpected packet in response to channel open: <nil>', attempting reconnect
If I run with with -debug
I'm able to manually SSH in while it gets the above error
According to
The Documentation
ssh_agent_auth (boolean) - If true, the local SSH agent will be used to authenticate connections to the remote host. Defaults to false.
I killed my ssh-agent and then ran and got the following error on repeat
==> amazon-ebs: Connected to SSH!
2018/01/29 15:38:36 packer: 2018/01/29 15:38:36 Running the provision hook
2018/01/29 15:38:36 [INFO] (telemetry) Starting provisioner shell
2018/01/29 15:38:36 ui: ==> amazon-ebs: Provisioning with shell script: ubuntu/prepare_copy_files.sh
==> amazon-ebs: Provisioning with shell script: ubuntu/prepare_copy_files.sh
2018/01/29 15:38:36 packer: 2018/01/29 15:38:36 Opening ubuntu/prepare_copy_files.sh for reading
2018/01/29 15:38:36 packer: 2018/01/29 15:38:36 [INFO] 134 bytes written for 'uploadData'
2018/01/29 15:38:36 [INFO] 134 bytes written for 'uploadData'
2018/01/29 15:38:36 packer: 2018/01/29 15:38:36 opening new ssh session
2018/01/29 15:39:24 packer: 2018/01/29 15:39:24 ssh session open error: 'ssh: unexpected packet in response to channel open: <nil>', attempting reconnect
2018/01/29 15:39:24 packer: 2018/01/29 15:39:24 reconnecting to TCP connection for SSH
2018/01/29 15:39:29 packer: 2018/01/29 15:39:29 handshaking with SSH
2018/01/29 15:39:30 packer: 2018/01/29 15:39:30 handshake complete!
2018/01/29 15:39:30 packer: 2018/01/29 15:39:30 [ERROR] could not connect to local agent socket: /var/folders/r2/28gl3fn17ylcmcqz_jyfskw80000gn/T//ssh-Rq5THrgVebEN/agent.927
2018/01/29 15:40:17 packer: 2018/01/29 15:40:17 Retryable error: Error uploading script: ssh: unexpected packet in response to channel open: <nil>
How can I get the ssh connection to actually work?
packer version
Debug log output from PACKER_LOG=1 packer build template.json
.
https://gist.github.com/thebenwaters/43bec3ff381e5842163a6d23bfb668d0
The _simplest example template and scripts_ needed to reproduce the bug.
Include these in your gist https://gist.github.com
https://gist.github.com/thebenwaters/10d3d25ee61c04d0ba3809a6b85498ea
I also read the following issues #4363 #4373
I have tried by both
1) Killing my ssh-agent
2) unset SSH_AUTH_SOCK
3) unset SSH_AGENT_PID
It is able to "connect to ssh" but then dies
2018/01/29 15:56:05 ui: ==> amazon-ebs: Provisioning with shell script: ubuntu/prepare_copy_files.sh
==> amazon-ebs: Provisioning with shell script: ubuntu/prepare_copy_files.sh
2018/01/29 15:56:05 packer: 2018/01/29 15:56:05 Opening ubuntu/prepare_copy_files.sh for reading
2018/01/29 15:56:05 packer: 2018/01/29 15:56:05 opening new ssh session
2018/01/29 15:56:05 packer: 2018/01/29 15:56:05 [INFO] 134 bytes written for 'uploadData'
2018/01/29 15:56:05 [INFO] 134 bytes written for 'uploadData'
2018/01/29 15:56:51 packer: 2018/01/29 15:56:51 ssh session open error: 'ssh: unexpected packet in response to channel open: <nil>', attempting reconnect
2018/01/29 15:56:51 packer: 2018/01/29 15:56:51 reconnecting to TCP connection for SSH
2018/01/29 15:56:54 packer: 2018/01/29 15:56:54 handshaking with SSH
2018/01/29 15:56:55 packer: 2018/01/29 15:56:55 handshake complete!
I'm using the zsh plugin ssh-agent
OpenSSH_7.4p1, LibreSSL 2.5.0
My coworker is able to running with
Packer v1.1.3
OSX 10.11.6
OpenSSH_6.9p1, LibreSSL 2.1.8
With no ssh-agent and is able to run the same build script just fine
@thebenwaters do you still have this problem?
@rickard-von-essen on newer versions no. I'll go ahead and close for now
I seem to have this or at least a similar problem.
My log excerpt:
2019/01/28 17:15:17 packer.exe: 2019/01/28 17:15:17 [DEBUG] SSH handshake err: Timeout during SSH handshake
2019/01/28 17:15:24 packer.exe: 2019/01/28 17:15:24 [INFO] Attempting SSH connection...
2019/01/28 17:15:24 packer.exe: 2019/01/28 17:15:24 [DEBUG] reconnecting to TCP connection for SSH
2019/01/28 17:15:24 packer.exe: 2019/01/28 17:15:24 [DEBUG] handshaking with SSH
2019/01/28 17:16:07 packer.exe: 2019/01/28 17:16:07 [DEBUG] SSH handshake err: ssh: handshake failed: read tcp 127.0.0.1:62580->127.0.0.1:4001: wsarecv: Eine vorhandene Verbindung wurde vom Remotehost geschlossen.
2019/01/28 17:16:14 packer.exe: 2019/01/28 17:16:14 [INFO] Attempting SSH connection...
2019/01/28 17:16:14 packer.exe: 2019/01/28 17:16:14 [DEBUG] reconnecting to TCP connection for SSH
2019/01/28 17:16:14 packer.exe: 2019/01/28 17:16:14 [DEBUG] handshaking with SSH
2019/01/28 17:16:15 packer.exe: 2019/01/28 17:16:15 [DEBUG] handshake complete!
2019/01/28 17:16:15 packer.exe: 2019/01/28 17:16:15 [ERROR] could not connect to local agent socket: C:/Users/danowar/AppData/Local/Temp/ssh-4XmoKGTqHi7T/agent.6236
2019/01/28 17:16:15 ui: ==> virtualbox-iso: Connected to SSH!
2019/01/28 17:16:15 packer.exe: 2019/01/28 17:16:15 VBoxManage --version output: 5.2.24r128163
2019/01/28 17:16:15 packer.exe: 2019/01/28 17:16:15 VirtualBox version: 5.2.24
2019/01/28 17:16:15 ui: ==> virtualbox-iso: Uploading VirtualBox version info (5.2.24)
2019/01/28 17:16:15 packer.exe: 2019/01/28 17:16:15 [DEBUG] Opening new ssh session
2019/01/28 17:16:16 packer.exe: 2019/01/28 17:16:16 [DEBUG] Starting remote scp process: scp -vt .
2019/01/28 17:16:16 packer.exe: 2019/01/28 17:16:16 [DEBUG] Started SCP session, beginning transfers...
2019/01/28 17:16:16 packer.exe: 2019/01/28 17:16:16 [DEBUG] Copying input data into temporary file so we can read the length
2019/01/28 17:16:16 packer.exe: 2019/01/28 17:16:16 [DEBUG] scp: Uploading .vbox_version: perms=C0644 size=6
2019/01/28 17:16:17 ui error: ==> virtualbox-iso: Error uploading VirtualBox version: scp: ./.vbox_version: Permission denied
2019/01/28 17:16:17 ui: ==> virtualbox-iso: Step "StepUploadVersion" failed
2019/01/28 17:16:17 ui: ask: ==> virtualbox-iso: [c] Clean up and exit, [a] abort without cleanup, or [r] retry step (build may fail even if retry succeeds)?
Notes:
However, it seems that the SSH agent is tried regardless. I don't know what the value of ssh_agent_auth is at that moment (because debugging doesn't seem to provide that info).
Looking in the code repository...
https://github.com/hashicorp/packer/blob/43c30d6b7fba8d459c5b48c0a640da307e0e0e28/helper/communicator/config.go#L39
The config value is not initialized, I think. That probably means it defaults to false (like every boolean Packer switch, it seems).
This value is only used in the function SSHConfigFunc:
https://github.com/hashicorp/packer/blob/43c30d6b7fba8d459c5b48c0a640da307e0e0e28/helper/communicator/config.go#L99
Which seems to be used by most/all builders to build their SSH configuration object. Okay...
This is the function where the reconnect via the TCP connection is tried...
Here the handshake is complete...
Nowhere here is it checked whether we should use an existing SSH agent or not. It always tries to use it.
And inside...
It always looks for the SSH_AUTH_SOCK environment variable and then tries to connect via "unix".
Have I mentioned I'm using Packer in Windows? ;-)
So, to summarize, at the moment the option ssh_agent_auth is useless, because it isn't respected anyway and an existing SSH agent is always used. Which doesn't seem to work sometimes, at least in Windows.
Further digging revealed that after printing the intention
and before uploading the file (?), Packer checks if a last SCP operation was done without error...
I have no idea what is the "last scp operation" before trying to upload the VirtualBox version. Maybe it's the failed handshake using the SSH agent?
It seems that it is. I unset the SSH_AUTH_SOCK variable and the error vanishes.
Or not...Another run with an unset SSH_AUTH_SOCK produces the same error. I'll run again with logging enabled. Must be another thing, or how could Packer detect an SSH agent without that variable?
Log excerpts:
2019/01/29 10:00:57 packer.exe: 2019/01/29 10:00:57 [INFO] Waiting 1s
2019/01/29 10:00:58 ui: ==> virtualbox-iso: Using ssh communicator to connect: 127.0.0.1
2019/01/29 10:00:58 packer.exe: 2019/01/29 10:00:58 [INFO] Waiting for SSH, up to timeout: 30m0s
2019/01/29 10:00:58 ui: ==> virtualbox-iso: Waiting for SSH to become available...
Followed by lots of
2019/01/29 10:00:58 packer.exe: 2019/01/29 10:00:58 [INFO] Attempting SSH connection...
2019/01/29 10:00:58 packer.exe: 2019/01/29 10:00:58 [DEBUG] reconnecting to TCP connection for SSH
2019/01/29 10:00:58 packer.exe: 2019/01/29 10:00:58 [DEBUG] handshaking with SSH
2019/01/29 10:00:58 packer.exe: 2019/01/29 10:00:58 [DEBUG] SSH handshake err: ssh: handshake failed: read tcp 127.0.0.1:56942->127.0.0.1:2849: wsarecv: Eine vorhandene Verbindung wurde vom Remotehost geschlossen.
and a few of these
2019/01/29 10:06:06 packer.exe: 2019/01/29 10:06:06 [INFO] Attempting SSH connection...
2019/01/29 10:06:06 packer.exe: 2019/01/29 10:06:06 [DEBUG] reconnecting to TCP connection for SSH
2019/01/29 10:06:06 packer.exe: 2019/01/29 10:06:06 [DEBUG] handshaking with SSH
2019/01/29 10:06:06 packer.exe: 2019/01/29 10:06:06 [DEBUG] SSH handshake err: ssh: handshake failed: read tcp 127.0.0.1:58199->127.0.0.1:2849: wsarecv: Eine bestehende Verbindung wurde softwaregesteuert
2019/01/29 10:06:06 packer.exe: durch den Hostcomputer abgebrochen.
Finally,...
2019/01/29 10:16:15 packer.exe: 2019/01/29 10:16:15 [INFO] Attempting SSH connection...
2019/01/29 10:16:15 packer.exe: 2019/01/29 10:16:15 [DEBUG] reconnecting to TCP connection for SSH
2019/01/29 10:16:15 packer.exe: 2019/01/29 10:16:15 [DEBUG] handshaking with SSH
2019/01/29 10:17:15 packer.exe: 2019/01/29 10:17:15 [DEBUG] SSH handshake err: Timeout during SSH handshake
2019/01/29 10:17:22 packer.exe: 2019/01/29 10:17:22 [INFO] Attempting SSH connection...
2019/01/29 10:17:22 packer.exe: 2019/01/29 10:17:22 [DEBUG] reconnecting to TCP connection for SSH
2019/01/29 10:17:22 packer.exe: 2019/01/29 10:17:22 [DEBUG] handshaking with SSH
2019/01/29 10:18:06 packer.exe: 2019/01/29 10:18:06 [DEBUG] handshake complete!
2019/01/29 10:18:06 packer.exe: 2019/01/29 10:18:06 [INFO] no local agent socket, will not connect agent
2019/01/29 10:18:06 ui: ==> virtualbox-iso: Connected to SSH!
2019/01/29 10:18:06 packer.exe: 2019/01/29 10:18:06 VBoxManage --version output: 5.2.24r128163
2019/01/29 10:18:06 packer.exe: 2019/01/29 10:18:06 VirtualBox version: 5.2.24
2019/01/29 10:18:06 ui: ==> virtualbox-iso: Uploading VirtualBox version info (5.2.24)
2019/01/29 10:18:06 packer.exe: 2019/01/29 10:18:06 [DEBUG] Opening new ssh session
2019/01/29 10:18:07 packer.exe: 2019/01/29 10:18:07 [DEBUG] Starting remote scp process: scp -vt .
2019/01/29 10:18:07 packer.exe: 2019/01/29 10:18:07 [DEBUG] Started SCP session, beginning transfers...
2019/01/29 10:18:07 packer.exe: 2019/01/29 10:18:07 [DEBUG] Copying input data into temporary file so we can read the length
2019/01/29 10:18:07 packer.exe: 2019/01/29 10:18:07 [DEBUG] scp: Uploading .vbox_version: perms=C0644 size=6
2019/01/29 10:18:08 ui error: ==> virtualbox-iso: Error uploading VirtualBox version: scp: ./.vbox_version: Permission denied
2019/01/29 10:18:08 ui: ==> virtualbox-iso: Step "StepUploadVersion" failed
2019/01/29 10:18:08 ui: ask: ==> virtualbox-iso: [c] Clean up and exit, [a] abort without cleanup, or [r] retry step (build may fail even if retry succeeds)?
@danowar2k Can you open an new issue. This seems broken by looking on the code.
Hello @danowar2k, I also would like to help you but the information about this issue is a bit hard for me to gather and we have multiple thinks going on on packer to fix/improve right now. I'm repeating what @rickard-von-essen already said, but would it be okay for you to give us a simple way to reproduce this one, preferably in a new issue ? I hope I am not asking too much ! I have a windows machine ready for tests here.
Thanks 🙂
I will open a new issue. I could provide you with a Packer template or rather the contents of a git repository which include a Packer template and scripts (not yet documented very much). The only thing you would have to get for yourself is the Solaris 11.4 AI iso (they have a Text installer ISO and an AI ISO on their site.) You have to only register an Oracle account to download. Anyway. will repeat that in the new issue...
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.