Terraform: providers/aws: Failure associating EIP: InvalidAllocationID.NotFound: The allocation ID 'eipalloc-9b0b7cfe' does not exist

Created on 5 May 2015  ·  21 Comments  ·  Source: hashicorp/terraform

bug provider/aws

All 21 comments

This is happening in master branch.

Can you show us the config?

provider "aws" {
    access_key = "${var.access_key}"
    secret_key = "${var.secret_key}"
    region = "${var.region}"
}

// This is so we can get the DNS zone ID in order to create DNS records.
resource "terraform_remote_state" "network" {
    backend = "_local"

    config {
        path = "../network/terraform.tfstate"
    }
}

// This is so we can join Consul pool as a client member.
resource "terraform_remote_state" "consul" {
    backend = "_local"

    config {
        path = "../consul/terraform.tfstate"
    }
}

resource "aws_eip" "salt_master_a" {
    instance = "${aws_instance.salt_master_a.id}"
    vpc = true
}

resource "aws_eip" "salt_master_b" {
    instance = "${aws_instance.salt_master_b.id}"
    vpc = true
}

resource "aws_route53_record" "dns_a" {
    zone_id = "${terraform_remote_state.network.output.dns_zone_id}"
    name = "${var.salt_servername}1.${var.env}.managedbyq.com"
    type = "A"
    ttl = "30"
    records = [
        "${aws_eip.salt_master_a.public_ip}",
    ]
}

resource "aws_route53_record" "dns_b" {
    zone_id = "${terraform_remote_state.network.output.dns_zone_id}"
    name = "${var.salt_servername}2.${var.env}.managedbyq.com"
    type = "A"
    ttl = "30"
    records = [
        "${aws_eip.salt_master_b.public_ip}",
    ]
}


// EC2 Instances Zone A
resource "aws_instance" "salt_master_a" {
    ami = "${lookup(var.amis, var.region)}"
    instance_type = "${var.instance_type}"
    key_name = "${var.key_name}"
    availability_zone = "${var.zone_a}"
    subnet_id = "${terraform_remote_state.network.output.subnet_a_id}"
    vpc_security_group_ids = ["${terraform_remote_state.network.output.common_security_group_id}"]
    tags {
        Name = "${var.env}-${var.salt_servername}-a-${count.index+1}"
    }

    connection {
        user = "${var.key_user}"
        key_file = "${var.key_path}"
    }

    provisioner "file" {
        source = "../consul/files/consul/common/"
        destination = "/tmp"
    }

    provisioner "file" {
        source = "files/"
        destination = "/tmp"
    }

    provisioner "remote-exec" {
        inline = [
            "echo ${var.zone_a} > /tmp/consul-datacenter",
            "echo ${terraform_remote_state.consul.output.consul_a_ip_join} > /tmp/consul-server-private-addr",
            "sed 's/@@SALT_MASTER_ID@@/${var.env}-${var.salt_servername}-a-${count.index+1}/g' /tmp/service.json > /tmp/foo",
            "sed 's/@@SALT_MASTER_ADDR@@/${self.public_ip}/g' /tmp/foo > /tmp/foo2",
            "mv /tmp/foo2 /tmp/service.json",
            "sed 's/@@ENVIRONMENT@@/${var.env}/g' /tmp/master > /tmp/lala",
            "mv /tmp/lala /tmp/master"
        ]
    }

    // Installs Consul agent and Salt master.
    provisioner "remote-exec" {
        scripts = [
            "scripts/install.sh",
            "../consul/scripts/consul/install.sh",
            "../consul/scripts/consul/service.sh",
        ]
    }
}

// EC2 Instances Zone B
resource "aws_instance" "salt_master_b" {
    ami = "${lookup(var.amis, var.region)}"
    instance_type = "${var.instance_type}"
    key_name = "${var.key_name}"
    availability_zone = "${var.zone_b}"
    subnet_id = "${terraform_remote_state.network.output.subnet_b_id}"
    vpc_security_group_ids = ["${terraform_remote_state.network.output.common_security_group_id}"]
    tags {
        Name = "${var.env}-${var.salt_servername}-b-${count.index+1}"
    }

    connection {
        user = "${var.key_user}"
        key_file = "${var.key_path}"
    }

    provisioner "file" {
        source = "../consul/files/consul/common/"
        destination = "/tmp"
    }

    provisioner "file" {
        source = "files/"
        destination = "/tmp"
    }

    provisioner "remote-exec" {
        inline = [
            "echo ${var.zone_b} > /tmp/consul-datacenter",
            "echo ${terraform_remote_state.consul.output.consul_b_ip_join} > /tmp/consul-server-private-addr",
            "sed 's/@@SALT_MASTER_ID@@/${var.env}-${var.salt_servername}-b-${count.index+1}/g' /tmp/service.json > /tmp/foo",
            "sed 's/@@SALT_MASTER_ADDR@@/${self.public_ip}/g' /tmp/foo > /tmp/foo2",
            "mv /tmp/foo2 /tmp/service.json",
            "sed 's/@@ENVIRONMENT@@/${var.env}/g' /tmp/master > /tmp/lala",
            "mv /tmp/lala /tmp/master",
        ]
    }

    // Installs Consul agent and Salt master.
    provisioner "remote-exec" {
        scripts = [
            "scripts/install.sh",
            "../consul/scripts/consul/install.sh",
            "../consul/scripts/consul/service.sh"
        ]
    }
}

output "saltmaster1_eip" {
    value = "${aws_eip.salt_master_a.public_ip}"
}

output "saltmaster2_eip" {
    value = "${aws_eip.salt_master_b.public_ip}"
}

output "saltmaster1_dns" {
    value = "${aws_route53_record.dns_a.name}"
}

output "saltmaster2_dns" {
    value = "${aws_route53_record.dns_b.name}"
}

I got past that issue, but I don't remember exactly what I did :/. I think I just deleted the state file and started over.

That's a good hint. I'm quite confident this is an eventual consistency issue with AWS, so we'll have to retry the association. This is a strong example of where retrying a node wouldn't work: association is a substep within the creation process, so we have to do a more granular retry just on association.

@mitchellh I came here to post the exact same warning:

* InvalidAssociationID.NotFound: The association ID 'eipassoc-88b26ce1' does not exist

However, I believe that this started to show for me once I started to manually assign floating IPs from the machines.

The workflow that causes this issue:

  • boot machines + create floating IP without associating it to any machine
  • have the machines decide which one should use the floating IP and associate the IP to that machine using the AWS CLI tools
  • destroy infrastructure using Terraform
  • error occurs

I think this constitutes a valid setup in which a floating IP is used as a load-balancing solution. In our case, the floating IP is attached to a dns record, which is used in our VPN setup. We're using CoreOS and Fleet to manage units, so we don't know beforehand on which machine the VPN server will be started. This is why we attach the floating IP using the AWS CLI tools from inside the machines.

However, this then (apparently) changes the association ID, which Terraform uses to find the resource.

:+1: to @JeanMertz's comment, it applies to something I am currently doing and is a concern.

Hello! Does anyone have a very small, bare minimum config to help me reproduce this? There's a lot of remote state in Camilo's that I'm not sure I can reproduce well. I've noticed a few other issues with our EIP resource and would like to tackle them and this while I'm there.

If you do have a config I would greatly appreciate it! Be sure to remove any secrets!

It is difficult to reproduce since it seems to be a race condition on the AWS side.

@catsby just try creating an EIP and a Route53 record pointing to that EIP. At some point it should show up.

I am experiencing this issue following along with the getting started documentation:

https://terraform.io/intro/getting-started/dependencies.html

Deleting the state file fixed it.

I'm having the exact same problem - sometimes my apply will pass, sometimes it will fail. If it fails, I can just re-run the apply without making any changes to anything and it will pass. It's like it tries to associate the IP before something's ready.

Hey Friends – I'm doing a sweep of older issues and circled back here to see if I could reproduce this. I tried several times with the following config:

provider "aws" {
  region = "us-west-2"
}

resource "aws_eip" "ip" {
  count    = 4
  instance = "${element(aws_instance.example.*.id, count.index)}"
  vpc      = true
}

resource "aws_instance" "example" {
  count                       = 4
  ami                         = "ami-dfc39aef"
  instance_type               = "t2.micro"
  associate_public_ip_address = true
}

resource "aws_route53_zone" "primary" {
  name = "tftesting.com"
}

resource "aws_route53_record" "dns_a" {
  zone_id = "${aws_route53_zone.primary.id}"
  name    = "examplerecord"
  type    = "A"
  ttl     = "30"

  records = [
    "${aws_eip.ip.*.public_ip}",
  ]
}

It creates 4 instances and 4 EIPs and tries to add them to a Route 53 record, and after several plan->apply->destroy cycles I was unable to reproduce this issue.

If you have another config that demonstrates this I can take another look, but until then I'm going to close this. Thanks!

We are hitting that issue quite frequently. Should I open a new one?

@cmlad, yes please, with a configuration that can be used to reproduce the issue.

@c4milo, I'm still trying to reproduce the issue with a stand-alone config, as the config we use is part of a huge deploy system which I cannot share.

In the meantime, here is the debug-level log of the issue.

It seems to be a race condition in which AWS has not yet finished creating the EIP, so to reproduce it I think you need to run from a machine that is physically close to AWS (we run on an EC2 instance) and fast enough to issue the calls in quick succession.

It happens in roughly a quarter to a fifth of our deploys.

I was able to work around the issue by adding a sleep:

resource "aws_eip" "git" {
    instance = "${aws_instance.git.id}"
    vpc = true

    provisioner "local-exec" {
      command = "echo Waiting ${var.eip_propagation_wait_time} seconds for EIP to propagate; sleep ${var.eip_propagation_wait_time}"
    }
}

eip_alloc_error.txt

@c4milo I couldn't create a minimal example, but I found the cause.

In resource_aws_eip.go, Terraform creates the EIP (resourceAwsEipCreate), then immediately assigns it to an instance/network interface (resourceAwsEipUpdate), and then immediately queries back its status (resourceAwsEipRead).

Not sure if it's the AWS region we're using (us-west-2), or the huge amount of traffic on our account, but sometimes AWS seems to return from the creation call before the resource is fully created. In such a case, the next call (resourceAwsEipUpdate) fails and we get the error in this issue:

* aws_eip.frontend.1: Failure associating EIP: InvalidAllocationID.NotFound: The allocation ID 'eipalloc-220f8146' does not exist

Sometimes, the second call goes through ok, but then the 3rd one fails (resourceAwsEipRead) producing the following:

* Resource 'aws_eip.frontend_haproxy' does not have attribute 'public_ip' for variable 'aws_eip.frontend_haproxy.public_ip'

We were hitting both issues quite a bit, and I was able to solve it just by adding 5 seconds of sleep before the call to resourceAwsEipUpdate and before the call to resourceAwsEipRead.

I think the better solution is to retry these calls on failure. I can do a PR if you want though I've never done Go.

terraform destroy -target [resource].[name]
...
(ex. terraform destroy -target aws_eip.nat)

The above likely won't work but is worth a shot. The likely issue is that the "ghost" NAT GW is living in the remote tfstate file in S3 or elsewhere. Just delete the state file and rebuild. If you're not storing it remotely, delete the local state file.

I had a similar issue with Terraform trying to destroy an aws_eip that only existed in the state file (I had removed the resource manually), and Terraform was complaining about not being able to find it:

  • aws_eip.jenkins: InvalidAllocationID.NotFound: The allocation ID 'eipalloc-cc031ca9' does not exist
    status code: 400, request id: fe196f32-7294-4deb-bd31-b1eff650b117

I solved it by removing only this EIP from the state file.

Terraform Version: 0.10.3

@cs-mahmoud-khateeb that sounds like a different issue to me.

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
