Terraform v0.9.2
resource "openstack_compute_instance_v2" "instance" {
  count       = "${var.instances}"
  name        = "${var.name}-instance-${count.index + 1}"
  image_id    = "${var.disk_image}"
  flavor_name = "${var.machine_type}"
  key_pair    = "${var.key_name}"

  network {
    name           = "${var.net}"
    access_network = true
  }

  security_groups = [
    "${openstack_compute_secgroup_v2.allow-ssh.name}",
  ]
}

resource "openstack_compute_floatingip_associate_v2" "floatingip_assoc" {
  floating_ip = "${openstack_networking_floatingip_v2.floatingip.address}"
  instance_id = "${openstack_compute_instance_v2.instance.id}"
}

resource "null_resource" "provision" {
  depends_on = ["openstack_compute_floatingip_associate_v2.floatingip_assoc"]

  provisioner "remote-exec" {
    connection {
      user = "core"
      host = "${openstack_networking_floatingip_v2.floatingip.address}"
    }

    inline = [
      "docker run -d ${var.docker_ports} ${var.docker_repository}",
      "echo terraform executed > /tmp/foo",
    ]
  }
}
Expected behavior: TF executes the provisioner, reporting on screen what's going on.
Actual behavior: TF executes the provisioner, but outputs an abnormal amount of
openstack_compute_instance_v2.instance (remote-exec): Connecting to remote host via SSH...
openstack_compute_instance_v2.instance (remote-exec): Host: 193.62.52.106
openstack_compute_instance_v2.instance (remote-exec): User: core
openstack_compute_instance_v2.instance (remote-exec): Password: false
openstack_compute_instance_v2.instance (remote-exec): Private key: false
openstack_compute_instance_v2.instance (remote-exec): SSH Agent: true
while waiting on SSH to become available. Piping the output to a file, we ended up with 25 MB of logs to provision a single instance. The same provisioning with TF 0.7.13 writes ~500 KB.
Please list the steps required to reproduce the issue, for example:
terraform apply
Nobody with a bit of love for this issue? 👍
The output is coming from the SSH communicator: https://github.com/hashicorp/terraform/blob/8271a739f23a2bd1bcf4be6739b7e91a0caab332/communicator/ssh/communicator.go#L114-L125
which remote-exec calls in a loop until the timeout is reached: https://github.com/hashicorp/terraform/blob/master/builtin/provisioners/remote-exec/resource_provisioner.go#L156-L158
I've been struggling for hours to figure out how to make this quiet - tried all manner of settings of TF_LOG but nothing seems to matter. Indeed, looking at the code now it does not seem possible to quiet this output as it does not seem to have a log level and appears to be bypassing the logging infrastructure to output directly to the user via the UIOutput mechanism.
Incidentally, I don't think the fix here would ideally be to _silence_ this logging - I would actually like to see at least the basic information. The problems here are that:
Getting the logging down to a single line would be great. Something like:
openstack_compute_instance_v2.instance (remote-exec): Connecting via SSH: core@193.62.52.106 (password: false, key: false, agent: true)
Having a 1s sleep between connection attempts should completely eliminate the problem. Implementing some backoff between retries (as suggested in a comment in retryFunc: https://github.com/hashicorp/terraform/blob/master/builtin/provisioners/remote-exec/resource_provisioner.go#L237) could make it even better.
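A minimal Go sketch of the two suggestions in this comment: a one-line connection summary, and a sleep between attempts that doubles up to a cap. This is not Terraform's actual code. `connInfo`, `summary`, `nextDelay`, `retry`, and `errNotReady` are hypothetical names chosen for illustration, and the delay values are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error standing in for a failed SSH dial.
var errNotReady = errors.New("ssh not ready")

// connInfo is a hypothetical stand-in for the details printed on each attempt.
type connInfo struct {
	User, Host           string
	Password, Key, Agent bool
}

// summary renders the connection details as a single line.
func (c connInfo) summary() string {
	return fmt.Sprintf("Connecting via SSH: %s@%s (password: %t, key: %t, agent: %t)",
		c.User, c.Host, c.Password, c.Key, c.Agent)
}

// nextDelay doubles the delay after a failed attempt, capped at max.
func nextDelay(d, max time.Duration) time.Duration {
	d *= 2
	if d > max {
		return max
	}
	return d
}

// retry calls fn until it succeeds or the next sleep would pass the deadline.
func retry(timeout, initial, max time.Duration, fn func() error) error {
	deadline := time.Now().Add(timeout)
	delay := initial
	for {
		err := fn()
		if err == nil {
			return nil
		}
		if time.Now().Add(delay).After(deadline) {
			return fmt.Errorf("timeout: last error: %w", err)
		}
		time.Sleep(delay)
		delay = nextDelay(delay, max)
	}
}

func main() {
	info := connInfo{User: "core", Host: "193.62.52.106", Agent: true}
	fmt.Println(info.summary())

	attempts := 0
	err := retry(5*time.Second, 10*time.Millisecond, 100*time.Millisecond, func() error {
		attempts++
		if attempts < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println(attempts, err)
}
```

With a 1s initial delay, the number of attempts (and therefore of printed lines) is bounded by the timeout in seconds, rather than by how fast the loop can spin.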
The issue here is two-fold, and I don't think it should be fixed by any changes to output exclusively.
First, at some point during 0.9 development, it appears the back-off stopped working. This makes it so that Terraform is retrying _way too often_ and results in a _lot_ of output coming out. Realistically, we should be doing a fixed or even exponential backoff up to some reasonable fixed value. This would lower the output by orders of magnitude.
Second, the output can likely be improved to be a bit more "smart". We should probably only output the full connection attempt info once, and then on the follow-up attempts just put a single line of "Still attempting to connect via SSH...". This will naturally be prefixed with the resource which is saying that so we don't need to really be more specific.
These are two separate issues but together will completely resolve this issue. Neither of these is particularly difficult to address.
Addressing the output directly is only treating the symptom and not the actual problem here. It is understandable to see that as the issue but it is only the emergent behavior of a deeper (but not much more complicated) underlying issue.
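The second point in this comment (full details once, then a short status line) could look roughly like the Go sketch below. It is an illustration only, not the change that was eventually made in Terraform. `attemptReporter` and `next` are hypothetical names, and the helper returns lines instead of writing to Terraform's UI output so that it stays self-contained.

```go
package main

import "fmt"

// attemptReporter is a hypothetical helper that decides what to print
// for each connection attempt.
type attemptReporter struct {
	details  []string // connection details shown on the first attempt only
	attempts int
}

// next returns the lines to print for the next attempt: the full details
// the first time, and a single short status line on every later attempt.
func (r *attemptReporter) next() []string {
	r.attempts++
	if r.attempts > 1 {
		return []string{"Still attempting to connect via SSH..."}
	}
	lines := []string{"Connecting to remote host via SSH..."}
	for _, d := range r.details {
		lines = append(lines, "  "+d)
	}
	return lines
}

func main() {
	r := &attemptReporter{details: []string{
		"Host: 193.62.52.106",
		"User: core",
		"SSH Agent: true",
	}}
	// Simulate three connection attempts.
	for i := 0; i < 3; i++ {
		for _, line := range r.next() {
			fmt.Println(line)
		}
	}
}
```

Combined with a backoff between attempts, this keeps the output proportional to the number of retries at one line each, instead of six lines per retry.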
I agree that those two changes should completely address this issue.
no worries. i figured i'd throw some drivebys at it, and at least quiet it on my box :).
i figured the backoff would be trivial at first, then saw what was going on and then was confused, but managed to find the already pulled-in backoff dep, so tacked it on.
i couldn't find backoff in the existing retry stuff, but then again, i didn't really dive in either.
the actual implementation of the retry/backoff/connection stuff is a little bit non-trivial.
i'm probably going to drop it and keep my local patch for a while. idk if this backoff patch is a good bandaid or not, probably because i'm not sure how it originally was supposed to work.
godspeed.
I ran into this too. I had to revert from v0.9.6 to v0.8.8 because of this issue.
As far as I can tell, the main problem is the first point mentioned by @mitchellh. In v0.9.6 I saw much more output when trying to establish an SSH connection. This even filled 4.5 GB of disk space. (I realised that because suddenly I had 100 percent disk usage.) I don't know how that happened, but once I killed terraform I gained back the 4.5 GB.
I would very much appreciate reducing the number of connection retries, as already suggested. I could also imagine implementing some kind of "sleep" parameter for the connection, so that I could tell terraform to wait for some time before establishing the first connection.
How long until we see a fix for this in the 0.9.x branch? We are taking up a lot of space with log files due to this now.
It seems that this issue is fixed. I tested with terraform 0.10.3.
I seem to be having this issue with terraform version 0.11.1. I'm working on Debian 9, using the GCP provider. Is there a solution? Can a backoff timer be implemented?
I am going to close this issue due to inactivity and the long time since it was opened. The underlying code has had many changes since this issue was opened.
If there is still a question, I recommend the community forum, where there are far more people available to help. If there is a bug or you would like to make a feature request, please open a new issue and fill out the template.
Thanks!
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.