Terraform-provider-aws: Have Terraform stop the to-be-destroyed instance first when moving an attached EBS volume

Created on 27 Oct 2017  ·  6 comments  ·  Source: hashicorp/terraform-provider-aws

When destroying an instance, and then moving an attached volume to another instance, it would be nice if Terraform could send a Stop to the instance being destroyed first.

This is because often the volume doesn't detach in time and I end up just stopping the old instance and re-running the apply. The old instance is getting destroyed anyway.

Terraform Version

0.10.7 and before

Affected Resource(s)

aws_volume_attachment

Error Output

1 error(s) occurred:

  • module.sre_apps.aws_volume_attachment.az1_collector_data (destroy): 1 error(s) occurred:

  • aws_volume_attachment.az1_collector_data: Error waiting for Volume (vol-0ec7ae950309d165a) to detach from Instance: i-058b7f34f0a545f77

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error

Expected Behavior

The volume moves to the other instance quickly, without the detach timing out.

Actual Behavior

The volume often fails to detach in time and the apply errors out as shown above; the workaround is to manually stop the old instance and re-run the apply.

Steps to Reproduce

  1. Attach an EBS volume to a running instance with aws_volume_attachment.
  2. In a single apply, destroy that instance while moving the volume's attachment to another instance (a minimal configuration is sketched below).
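
For illustration, a minimal starting configuration could look like the following (resource names, AMI, and sizes are placeholders, not taken from the original report):

resource "aws_instance" "old" {
  ami               = "ami-00000000"
  instance_type     = "t2.micro"
  availability_zone = "us-east-1a"
}

resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = 10
}

resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdf"
  volume_id   = "${aws_ebs_volume.data.id}"
  instance_id = "${aws_instance.old.id}"
}

# To reproduce: replace aws_instance.old with a new aws_instance, point
# instance_id above at the new instance, and apply. The attachment must be
# destroyed first, and "Error waiting for Volume ... to detach" can occur
# because the volume is still mounted inside the still-running old instance.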
Labels: enhancement, service/ec2, stale

All 6 comments

There are a bunch of tickets about this problem:
https://github.com/terraform-providers/terraform-provider-aws/search?q=%22Error+waiting+for+Volume%22&type=Issues
And also https://github.com/hashicorp/terraform/issues/2957 which is closed but still relevant I think.

The problem seems to be that the EBS volume is being detached from the EC2 instance while it is still mounted inside the instance's operating system.

Shutting down the EC2 instance first is probably the only sane thing to do in general. I can do that by adding a destroy-time provisioner to the volume attachment:

  provisioner "remote-exec" {
    when   = "destroy"
    inline = ["sudo poweroff"]
  }

But there's no way to start the instance again. So when changing the attachment of an EC2 instance from EBS volume A to volume B I then get this error:

Error waiting for instance (i-07416cee66e784c04) to become ready: unexpected state 'stopped', wanted target 'running'. last error: %!s(<nil>)

Not sure how the Terraform provider could handle this. #569 suggests adding a new aws_instance_state resource, which might help here.

The best solution would probably be for the AWS provider to stop the instance before detaching and to start it again before attaching, both via the AWS APIs.
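
Until the provider handles that natively, one way to approximate the stop-before-detach half without SSH access is a destroy-time local-exec provisioner on the attachment. This is only a sketch: the resource names are placeholders, and it assumes the AWS CLI and credentials are available wherever Terraform runs:

resource "aws_volume_attachment" "example" {
  device_name = "/dev/sdf"
  volume_id   = "${aws_ebs_volume.data.id}"
  instance_id = "${aws_instance.old.id}"

  # Sketch only: stop the instance via the EC2 API before Terraform tries to
  # detach the volume, then block until it is actually stopped. Requires the
  # AWS CLI and credentials on the machine running Terraform.
  provisioner "local-exec" {
    when       = "destroy"
    on_failure = "continue"
    command    = "aws ec2 stop-instances --instance-ids ${self.instance_id} && aws ec2 wait instance-stopped --instance-ids ${self.instance_id}"
  }
}

Unlike the remote-exec approach above, this does not need SSH access to the instance, but it leaves the instance stopped afterwards, which only matters if the instance is not itself being destroyed.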

My eventual solution, which only works for us because the instance and volume are both ephemeral and will always be created and destroyed together:

resource "aws_volume_attachment" "volume_attachment" {
  device_name  = "${module.common.device_name_label}"
  volume_id    = "${var.volume_id}"
  instance_id  = "${module.instance.id}"

  # Fix for https://github.com/terraform-providers/terraform-provider-aws/issues/2084
  provisioner "remote-exec" {
    inline     = ["sudo poweroff"]
    when       = "destroy"
    on_failure = "continue"

    connection {
      type        = "ssh"
      host        = "${module.instance.private_ip}"
      user        = "${lookup(module.common.linux_user_map, var.os)}"
      private_key = "${file("${var.local_key_file}")}"
      agent       = false
    }
  }

  # Make sure instance has had some time to power down before attempting volume detachment
  provisioner "local-exec" {
    command = "sleep 30"
    when    = "destroy"
  }
}

@duality72 can I use your solution for instances in private subnets?

@GarrisonD very late reply here, but I don't see why not
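
For instances in private subnets, the remote-exec connection only needs SSH reachability from wherever Terraform runs, so it can hop through a bastion host. A rough sketch (the bastion variable is a placeholder, not from the original configuration):

    connection {
      type         = "ssh"
      host         = "${module.instance.private_ip}"
      user         = "${lookup(module.common.linux_user_map, var.os)}"
      private_key  = "${file("${var.local_key_file}")}"
      bastion_host = "${var.bastion_public_ip}"  # placeholder for a reachable bastion in the same VPC
      agent        = false
    }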

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

