_This issue was originally opened by @landalex as hashicorp/terraform#15636. It was migrated here as a result of the provider split. The original body of the issue is below._
The private and public IP of the Master node EC2 instance should be exported upon creation of an EMR cluster, so they can be used for creating Route53 records that point to the cluster (among other usages).
Hey @landalex,
Thanks for opening this issue. I just checked quickly and couldn't find the IP address attribute when describing a cluster. I am not even sure that getting the IP address without tricky stuff is a doable thing :(
Couldn't you use the master_public_dns attribute along with a Route53 A ALIAS record?
The configuration I'm using puts the EMR cluster on a private subnet, so the master_public_dns gets set to an internal (private) DNS record (not accessible outside of AWS). I don't want to use a public DNS record for this cluster, so I need to get the private IP from the EMR cluster (which is the feature I'm requesting).
I'm no expert on the inner workings of the Terraform EMR module, but is there no private IP available to export from the master node EC2 instance that gets created? There are a bunch of EC2 attributes available, but they currently don't include DNS/IP information.
As @Ninir mentioned there is no way to get an IP of the master EMR instance from the API:
http://docs.aws.amazon.com/ElasticMapReduce/latest/API/API_DescribeCluster.html
So there is really not much we can do on Terraform/SDK's side.
@landalex Do you mind explaining why you are keen on having an IP address instead of a hostname? Using hostnames instead of IPs is usually considered good practice, especially in AWS (and most other cloud environments), because instances are treated as ephemeral and may die and later come up with a different IP. In that case you'd need to redeploy anything referencing the old IP, which is a manual process you probably don't want to be involved in, especially if this happens at 3am.
I don't know much about EMR myself, but it may be intentional on Amazon's side, so they're able to quickly recover from a failure by launching a new instance with a new IP without causing further disruption to the rest of your infrastructure.
They don't expose IPs for ELBs or ALBs either, because DNS is used there as one of the load-balancing mechanisms and IPs (and number of IPs/instances) may change over time as ELBs/ALBs scale accordingly based on demand.
Since the DNS record that's being exposed by EMR is AWS internal, I can't resolve it on my local machine without adding AWS's DNS to my local machine's nameserver config. Since other people will be accessing this instance, it's impractical to have them all change their config, particularly because they aren't technical, and therefore using DNS won't work. The solution to this is to use an IP.
I was thinking the same about Amazon's approach to EMR, and them possibly not exposing an IP to make recovering from failure more smooth, except the DNS record created for the EMR cluster is of the form ip-**-**-**-***.us-west-2.compute.internal, where the private IP address assigned is contained within the record, and so every time a new IP is assigned a new DNS record is created and assigned to the master node, which completely negates any ability for DNS to help with preventing disruption after an IP change.
The difference with ELBs/ALBs is that you can assign them to Auto-Scaling Groups, which EMR seems to deal with itself (it has the ability to auto-scale itself), but as far as I can tell you cannot attach an ELB/ALB to an EMR cluster (probably because EMR only wants you to access the Master node, but that doesn't help the fact that I can't access the Master node with the DNS entry it gives me either).
Either way I don't think there's anything we can do from Terraform's side as we're bound by the API limitations - so do you mind me closing this issue?
You can go ahead and close it, thanks for the help regardless!
I have a similar need. I resorted to this madness.
locals {
  emr_ip = "${replace(replace(element(split(".", aws_emr_cluster.emr-cluster.master_public_dns), 0), "ip-", ""), "-", ".")}"
}
And then just used ${local.emr_ip} where I needed. Hope this helps someone else.
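For anyone on Terraform 0.12 or later, the same string manipulation can be sketched without the interpolation wrapper. The hostname below is a made-up sample standing in for aws_emr_cluster.emr-cluster.master_public_dns, and the trick only holds while the private DNS name keeps the ip-A-B-C-D form described earlier in the thread (it would not work with instance-ID-based hostnames):

```hcl
locals {
  # Hypothetical sample value; in a real configuration this would be
  # aws_emr_cluster.emr-cluster.master_public_dns.
  sample_master_dns = "ip-10-0-1-23.us-west-2.compute.internal"

  # Take the first DNS label, strip the "ip-" prefix,
  # then turn the remaining dashes into dots.
  sample_master_ip = replace(
    replace(element(split(".", local.sample_master_dns), 0), "ip-", ""),
    "-",
    "."
  )
}

output "sample_master_ip" {
  value = local.sample_master_ip
}
```

With that sample value, the output should evaluate to 10.0.1.23.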
or you can add this
data "aws_instance" "master" {
  filter {
    name   = "private-dns-name"
    values = ["${aws_emr_cluster.cluster.master_public_dns}"]
  }
}
and access the IP address using "${data.aws_instance.master.private_ip}"
data "aws_instance" "emr_datasource" {
  filter {
    name   = "dns-name"
    values = ["${aws_emr_cluster.cluster.master_public_dns}"]
  }
}
@jcardosotalkdesk @tolgaingenc
Below block works:
data "aws_instance" "emr_datasource" {
  filter {
    name   = "network-interface.private-dns-name"
    values = ["${aws_emr_cluster.cluster.master_public_dns}"]
  }
}
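To tie this back to the original goal of a Route53 record for a cluster on a private subnet, here is a minimal sketch in Terraform 0.12+ syntax. It assumes aws_emr_cluster.cluster is defined elsewhere; the zone_id variable, the emr_master names, and the record name are hypothetical placeholders, not anything from this thread:

```hcl
# Hypothetical input: the ID of an existing Route53 hosted zone.
variable "zone_id" {
  type = string
}

# Look up the master node's EC2 instance by its private DNS name,
# using the same filter as the block above.
data "aws_instance" "emr_master" {
  filter {
    name   = "network-interface.private-dns-name"
    values = [aws_emr_cluster.cluster.master_public_dns]
  }
}

# Point a friendly name (hypothetical) at the master node's private IP.
resource "aws_route53_record" "emr_master" {
  zone_id = var.zone_id
  name    = "emr-master.example.internal"
  type    = "A"
  ttl     = 300
  records = [data.aws_instance.emr_master.private_ip]
}
```

As noted earlier in the thread, the IP can change if the master node is replaced, so a record like this would stay stale until the next terraform apply.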
Is there a way to have the CORE instances' private IPs as an attribute reference? In order to navigate through the logs of the Spark HistoryServer, we would like to set up DNS A records for the core instances' IPs, as the Spark UI automatically builds its URLs using DNS names. I see the boto3 documentation has PrivateIpAddress in the response structure.
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-web-interfaces.html
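One possible workaround, sketched here rather than confirmed anywhere in this thread, is to look the core nodes up through the aws_instances data source by the system tags EMR puts on its EC2 instances. It assumes aws_emr_cluster.cluster is defined elsewhere, that the cluster's id matches the job-flow-id tag, and that the core nodes are already running when the data source is read; the emr_core names are hypothetical:

```hcl
# Look up running EC2 instances that EMR has tagged as CORE nodes of this cluster.
# Assumption: EMR applies these system tags to every node it launches.
data "aws_instances" "emr_core" {
  instance_tags = {
    "aws:elasticmapreduce:job-flow-id"         = aws_emr_cluster.cluster.id
    "aws:elasticmapreduce:instance-group-role" = "CORE"
  }

  instance_state_names = ["running"]
}

# List of private IPs, one per matching core instance.
output "emr_core_private_ips" {
  value = data.aws_instances.emr_core.private_ips
}
```

The list only reflects the instances present at plan time, so core nodes added or replaced later would not show up until the next run.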
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!