Hey guys,
I have a scenario where, if I bring down an EC2 instance in order to update my Elasticsearch version (this setup is for my Elasticsearch cluster on AWS), I would like to detach my EBS volume and re-attach it to the new EC2 instance running the updated Elasticsearch AMI.
I believe I can already detach an EBS volume on EC2 destruction this way:
ebs_block_device {
  device_name           = "/dev/sdh"
  volume_type           = "io1"
  iops                  = "4000"
  volume_size           = "500"
  delete_on_termination = "false"
}
However, I'm still trying to find out how to re-attach this existing EBS volume to a new EC2 instance in Terraform prior to bringing that EC2 instance/node back into my cluster.
Can you guys please help me with this?
Thanks,
Ben.
Same issue here.
I also tried to create an aws_ebs_volume and attach it with aws_volume_attachment. But here I don't see how we can bootstrap (mkfs + mount) the disk with cloudinit or a provisioning script.
I've managed to reattach a volume (including mounting) using the user_data field:
resource "aws_instance" "web" {
ami = "ami-b7f0f987"
instance_type = "t2.micro"
availability_zone = "${var.aws_region}a"
vpc_security_group_ids = ["${aws_security_group.default.id}"]
iam_instance_profile = "ecsInstanceRole"
key_name = "us-west-ecs"
user_data = "#!/bin/bash\nmkdir /data; mount /dev/xvdh /data; service docker restart; echo 'ECS_CLUSTER=${aws_ecs_cluster.cms.name}\nECS_ENGINE_AUTH_TYPE=dockercfg\nECS_ENGINE_AUTH_DATA={\"${var.registry}\": {\"auth\": \"${var.auth}\",\"email\": \"${var.email}\"}}' >> /etc/ecs/ecs.config;"
}
# Attach DBMS Volume
resource "aws_volume_attachment" "ebs_att" {
  device_name = "/dev/xvdh"
  volume_id   = "<volume-id to reattach>"
  instance_id = "${aws_instance.web.id}"
}
If one wants to use an EBS volume attached to a Docker container, as in the example above, make sure service docker restart is included. One _has to_ restart the Docker daemon; otherwise the attached volume is not used, and unfortunately the AWS init process won't tell you that.
No update on this?
Should we use aws_ebs_volume instead?
Regarding your question
But here I don't see how we can bootstrap (mkfs + mount) the disk with cloudinit or a provisioning script
I'm pretty sure you could bootstrap it via the user_data field, doing something like:
#!/bin/bash
mkfs -t ext4 /dev/xvdh  # bootstrapping: create the filesystem
mkdir /data             # create mount point
mount /dev/xvdh /data   # mount it
# service docker restart (if you want to use it as a docker volume)
Should we use aws_ebs_volume instead?
Well, I guess for the creation it would be correct.
Does this help you?
Well, you can do that if you are using an "aws_ebs_volume" resource, but not an "ebs_block_device" inside an "aws_instance" resource.
I was hoping to bootstrap the EBS volume outside of cloud-init, since I have a lot of different instances and most of the time the only differences between them are the EBS volumes.
O.k., now I see your point. It's an interesting question.
Using the proposition of @phinze for now. See : https://github.com/hashicorp/terraform/pull/2050
Hey all – I believe this issue has been resolved with aws_volume_attachment. Mounting needs to be done in the user_data section as mentioned. Thanks!
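For reference, a minimal sketch of that pattern might look like the following (untested; the AMI, availability zone, size, and the assumption that /dev/sdh shows up as /dev/xvdh on the instance are placeholders, not values from this thread):

# Sketch only: create the volume, attach it, and mount it from user_data.
resource "aws_ebs_volume" "data" {
  availability_zone = "us-west-2a"   # placeholder
  size              = 500
}

resource "aws_instance" "web" {
  ami               = "ami-xxxxxxxx" # placeholder
  instance_type     = "t2.micro"
  availability_zone = "us-west-2a"   # must match the volume's AZ

  # Mounting still happens on the instance itself, e.g. via user_data;
  # /dev/sdh typically appears as /dev/xvdh on recent Linux AMIs.
  user_data = "#!/bin/bash\nmkdir -p /data; mount /dev/xvdh /data;"
}

resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdh"
  volume_id   = "${aws_ebs_volume.data.id}"
  instance_id = "${aws_instance.web.id}"
}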
I have an existing volume that I want to attach to an Amazon Linux instance and ensure that upon reboot it will be reattached.
riak-user-data.sh:
#!/bin/bash
mkdir /data
echo '/dev/sdh /data ext4 defaults,nofail,noatime,nodiratime,barrier=0,data=writeback 0 2' >> /etc/fstab
mount -a
resource "aws_instance" "riak" {
...
user_data = "${file("riak-user-data.sh")}"
provisioner "remote-exec" {
inline = [
"sudo mkdir -m 0755 -p /etc/ansible/facts.d",
]
connection {
user = "ec2-user"
bastion_host = "${data.terraform_remote_state.vpc.jumphost_eip_public_ip}"
bastion_user = "ubuntu"
}
}
}
resource "aws_ebs_volume" "riak" { ... }
resource "aws_volume_attachment" "riak" {
device_name = "/dev/sdh"
volume_id = "${aws_ebs_volume.riak.id}"
instance_id = "${aws_instance.riak.id}"
}
The process runs as follows, eventually timing out because the remote-exec provisioner is never able to connect:
aws_instance.riak: Creating...
aws_instance.riak: Still creating... (10s elapsed)
aws_instance.riak (remote-exec): Using configured bastion host...
aws_instance.riak (remote-exec): Host: 34.XXX.XXX.XXX
aws_instance.riak (remote-exec): User: ubuntu
aws_instance.riak (remote-exec): Password: false
aws_instance.riak (remote-exec): Private key: false
aws_instance.riak (remote-exec): SSH Agent: true
...
aws_instance.riak: Still creating... (5m30s elapsed)
Error applying plan:
1 error(s) occurred:
* aws_instance.riak: 1 error(s) occurred:
* timeout
It seems that the user_data is leading to a system that fails to start SSH; likely the system is failing to completely boot (there is a clear warning about fstab in the AWS docs). When I remove the specified user_data, the machine boots successfully and then we see:
aws_volume_attachment.riak: Creating...
device_name: "" => "/dev/sdh"
force_detach: "" => "<computed>"
instance_id: "" => "i-01c2d24bbeecb4586"
skip_destroy: "" => "true"
volume_id: "" => "vol-0d4d48ac7cdef77ec"
aws_volume_attachment.riak: Still creating... (10s elapsed)
aws_volume_attachment.riak: Still creating... (20s elapsed)
aws_volume_attachment.riak: Creation complete (ID: vai-3330874248)
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
Of course, the volume is attached but not mounted.
My question is: how can any cloud-init user_data that attempts to mount the device ever execute reliably, considering that the attachment cannot occur until after the instance ID is obtained? I suppose that if I have no remote-exec, then the API call to create the instance will return an instance ID fast enough that the attachment API call can be made before the cloud-init user_data is executed?
I would appreciate knowing if I am thinking of this all wrong. Thanks!
After much testing, the most reliable solution has turned out to be using a provisioner on the aws_volume_attachment. No matter how long it takes to bring up the aws_instance, the attachment will not be mounted until the host is booted and SSH is available for provisioning.
resource "aws_volume_attachment" "riak" {
skip_destroy = true
provisioner "remote-exec" {
script = "attach-data-volume.sh"
connection {
host = "${aws_instance.riak.public_ip}"
}
}
}
#!/bin/bash
# Resolve the real block device (e.g. /dev/sdh -> /dev/xvdh)
devpath=$(readlink -f /dev/sdh)

# Create a filesystem only if the device exists and doesn't already contain ext4
sudo file -s "$devpath" | grep -q ext4
if [[ 1 == $? && -b "$devpath" ]]; then
  sudo mkfs -t ext4 "$devpath"
fi

sudo mkdir /data
sudo chown riak:riak /data
sudo chmod 0775 /data
echo "$devpath /data ext4 defaults,nofail,noatime,nodiratime,barrier=0,data=writeback 0 2" | sudo tee -a /etc/fstab > /dev/null
sudo mount /data

# TODO: /etc/rc3.d/S99local to maintain on reboot
echo deadline | sudo tee /sys/block/$(basename "$devpath")/queue/scheduler
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
Ran with @aiwilliams's example a bit:
Is @aiwilliams approach still considered the best way to do this? Notably, this doesn't work well (at all?) if your aws_instance is in a private subnet.
@mwakerman I suppose you can set a bastion_host in your provisioner:
provisioner "remote-exec" {
  script = "attach-data-volume.sh"

  connection {
    host         = "${aws_instance.riak.public_ip}"
    bastion_host = "xxxxx"
  }
}
I have the same issue and just discovered that provisioners can be attached to any resource, not just the instance; that's going to fix a big issue for me.
Thanks @earzur, that should work for us.
So it looks like Terraform doesn't allow you to use a passphrase-encrypted private_key in the connection block of the remote-exec provisioner. I may be able to temporarily add and then revoke unencrypted keys, but I might also try doing the mkfs and mount in user-data after polling until the EBS volume has been attached.
Ended up doing it all in user-data by giving the instance an IAM role that included an ec2:Describe* policy and waiting until the EBS volume attaches with (credit):
while [ ! -e /dev/xvdh ]; do sleep 1; done
# die is assumed to be a simple error helper; defined here so the script is self-contained
die() { echo "$1" >&2; exit 1; }

EC2_INSTANCE_ID=$(wget -q -O - http://169.254.169.254/latest/meta-data/instance-id || die "wget instance-id has failed: $?")
EC2_AVAIL_ZONE=$(wget -q -O - http://169.254.169.254/latest/meta-data/placement/availability-zone || die "wget availability-zone has failed: $?")
EC2_REGION=$(echo "$EC2_AVAIL_ZONE" | sed -e 's:\([0-9][0-9]*\)[a-z]*$:\1:')
#############
# EBS VOLUME
#
# note: /dev/sdh => /dev/xvdh
# see: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/device_naming.html
#############
# wait for EBS volume to attach
DATA_STATE="unknown"
until [ $DATA_STATE == "attached" ]; do
DATA_STATE=$(aws ec2 describe-volumes \
--region $${EC2_REGION} \
--filters \
Name=attachment.instance-id,Values=$${EC2_INSTANCE_ID} \
Name=attachment.device,Values=/dev/sdh \
--query Volumes[].Attachments[].State \
--output text)
echo 'waiting for volume...'
sleep 5
done
echo 'EBS volume attached!'
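If it helps anyone, the IAM side of that approach (an instance role granting ec2:Describe* so the describe-volumes poll can run) could be sketched in Terraform roughly as follows; the resource and profile names are illustrative, not from this thread:

# Illustrative sketch only: role, inline policy and instance profile for the
# ec2:Describe* permission the user-data poll needs.
resource "aws_iam_role" "volume_poller" {
  name = "volume-poller"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "describe" {
  name = "ec2-describe"
  role = "${aws_iam_role.volume_poller.id}"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    }
  ]
}
EOF
}

resource "aws_iam_instance_profile" "volume_poller" {
  name = "volume-poller"
  role = "${aws_iam_role.volume_poller.name}"
}

# Then reference it from the instance with:
#   iam_instance_profile = "${aws_iam_instance_profile.volume_poller.name}"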
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.