Terraform-provider-aws: Have Terraform stop the to-be-destroyed instance first when moving an attached EBS volume

Created on 27 Oct 2017  ·  6 comments  ·  Source: hashicorp/terraform-provider-aws

When destroying an instance, and then moving an attached volume to another instance, it would be nice if Terraform could send a Stop to the instance being destroyed first.

This is because often the volume doesn't detach in time and I end up just stopping the old instance and re-running the apply. The old instance is getting destroyed anyway.

Terraform Version

0.10.7 and before

Affected Resource(s)

aws_volume_attachment

Error Output

1 error(s) occurred:

  • module.sre_apps.aws_volume_attachment.az1_collector_data (destroy): 1 error(s) occurred:

  • aws_volume_attachment.az1_collector_data: Error waiting for Volume (vol-0ec7ae950309d165a) to detach from Instance: i-058b7f34f0a545f77

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error

Expected Behavior

The volume moves to the other instance quickly, without the detach timing out.

Actual Behavior

The volume often fails to detach in time and the apply errors out as shown above; the workaround is to manually stop the old instance and re-run the apply.

Steps to Reproduce

  1. Attach an EBS volume to a running instance with aws_volume_attachment.
  2. In a single apply, destroy that instance while moving the volume's attachment to another instance (a minimal configuration is sketched below).
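
For illustration, a minimal starting configuration could look like the following (resource names, AMI, and sizes are placeholders, not taken from the original report):

resource "aws_instance" "old" {
  ami               = "ami-00000000"
  instance_type     = "t2.micro"
  availability_zone = "us-east-1a"
}

resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = 10
}

resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdf"
  volume_id   = "${aws_ebs_volume.data.id}"
  instance_id = "${aws_instance.old.id}"
}

# To reproduce: replace aws_instance.old with a new aws_instance, point
# instance_id above at the new instance, and apply. The attachment must be
# destroyed first, and "Error waiting for Volume ... to detach" can occur
# because the volume is still mounted inside the still-running old instance.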
Labels: enhancement, service/ec2, stale

All 6 comments

There are a bunch of tickets about this problem:
https://github.com/terraform-providers/terraform-provider-aws/search?q=%22Error+waiting+for+Volume%22&type=Issues
And also https://github.com/hashicorp/terraform/issues/2957 which is closed but still relevant I think.

The problem seems to be that the EBS volume is being detached from the EC2 instance while it is still mounted inside the instance's operating system.

Shutting down the EC2 instance first is probably the only sane thing to do in general. I can do that by adding a destroy-time provisioner to the volume attachment:

  provisioner "remote-exec" {
    when   = "destroy"
    inline = ["sudo poweroff"]
  }

But there's no way to start the instance again. So when changing the attachment of an EC2 instance from EBS volume A to volume B I then get this error:

Error waiting for instance (i-07416cee66e784c04) to become ready: unexpected state 'stopped', wanted target 'running'. last error: %!s(<nil>)

Not sure how the Terraform provider could handle this. #569 suggests adding a new aws_instance_state resource, which might help here.

The best solution would probably be for the AWS provider to stop the instance before detaching and to start it again before attaching, both via the AWS APIs.
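
Until the provider handles that natively, one way to approximate the stop-before-detach half without SSH access is a destroy-time local-exec provisioner on the attachment. This is only a sketch: the resource names are placeholders, and it assumes the AWS CLI and credentials are available wherever Terraform runs:

resource "aws_volume_attachment" "example" {
  device_name = "/dev/sdf"
  volume_id   = "${aws_ebs_volume.data.id}"
  instance_id = "${aws_instance.old.id}"

  # Sketch only: stop the instance via the EC2 API before Terraform tries to
  # detach the volume, then block until it is actually stopped. Requires the
  # AWS CLI and credentials on the machine running Terraform.
  provisioner "local-exec" {
    when       = "destroy"
    on_failure = "continue"
    command    = "aws ec2 stop-instances --instance-ids ${self.instance_id} && aws ec2 wait instance-stopped --instance-ids ${self.instance_id}"
  }
}

Unlike the remote-exec approach above, this does not need SSH access to the instance, but it leaves the instance stopped afterwards, which only matters if the instance is not itself being destroyed.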

My eventual solution, which only works for us because the instance and volume are both ephemeral and will always be created and destroyed together:

resource "aws_volume_attachment" "volume_attachment" {
  device_name  = "${module.common.device_name_label}"
  volume_id    = "${var.volume_id}"
  instance_id  = "${module.instance.id}"

  # Fix for https://github.com/terraform-providers/terraform-provider-aws/issues/2084
  provisioner "remote-exec" {
    inline     = ["sudo poweroff"]
    when       = "destroy"
    on_failure = "continue"

    connection {
      type        = "ssh"
      host        = "${module.instance.private_ip}"
      user        = "${lookup(module.common.linux_user_map, var.os)}"
      private_key = "${file("${var.local_key_file}")}"
      agent       = false
    }
  }

  # Make sure instance has had some time to power down before attempting volume detachment
  provisioner "local-exec" {
    command = "sleep 30"
    when    = "destroy"
  }
}

@duality72 can I use your solution for instances in private subnets?

@GarrisonD very late reply here, but I don't see why not
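
For instances in private subnets, the remote-exec connection only needs SSH reachability from wherever Terraform runs, so it can hop through a bastion host. A rough sketch (the bastion variable is a placeholder, not from the original configuration):

    connection {
      type         = "ssh"
      host         = "${module.instance.private_ip}"
      user         = "${lookup(module.common.linux_user_map, var.os)}"
      private_key  = "${file("${var.local_key_file}")}"
      bastion_host = "${var.bastion_public_ip}"  # placeholder for a reachable bastion in the same VPC
      agent        = false
    }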

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

