Nomad: RPC Advertise Address not Advertisable if -bind 0.0.0.0

Created on 1 Oct 2015 · 12 Comments · Source: hashicorp/nomad

Getting the following:

[root@nomad ~]# nomad agent -server -bootstrap-expect 1 -data-dir /tmp/nomad -bind 0.0.0.0
==> WARNING: Bootstrap mode enabled! Potentially unsafe operation.
==> Starting Nomad agent...
==> Error starting agent: server setup failed: Failed to start RPC layer: RPC advertise address is not advertisable: [::]:4647

This happens only if -bind 0.0.0.0 is set; setting an interface IP clears the error. It occurs whether or not IPv6 is enabled. Tested on CentOS 7 and CoreOS Stable (in client-only mode).
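The error above comes down to a wildcard address being usable for listening but not for telling peers where to connect. A minimal Go sketch of that kind of check follows; `isAdvertisable` is a hypothetical helper written for illustration, not Nomad's actual code, and it simplifies by treating hostnames as not advertisable.

```go
package main

import (
	"fmt"
	"net"
)

// isAdvertisable (hypothetical helper) reports whether hostPort names a
// concrete IP address that other nodes could dial. Wildcard addresses
// such as 0.0.0.0 and [::] are rejected. Simplification: anything that
// is not a literal IP (for example a hostname) is rejected too.
func isAdvertisable(hostPort string) bool {
	host, _, err := net.SplitHostPort(hostPort)
	if err != nil {
		return false
	}
	ip := net.ParseIP(host)
	return ip != nil && !ip.IsUnspecified()
}

func main() {
	for _, addr := range []string{"[::]:4647", "0.0.0.0:4647", "10.0.0.5:4647"} {
		fmt.Printf("%-14s advertisable=%v\n", addr, isAdvertisable(addr))
	}
}
```

Under this check, only an address with a concrete IP passes, which matches the observation that setting an interface IP clears the error.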

stage/thinking theme/networking type/enhancement

All 12 comments

I encounter the same problem: binding on 0.0.0.0 works, but there is a problem with the advertise address... only one can be used.

In Consul I never encountered this problem; maybe it's implicit there, but in Nomad you have to specify it.

Here is part of my config:

bind_addr = "0.0.0.0"
advertise {
    rpc = "my.ip:4647"
    serf = "my.ip:4648"
}

I have asked in the Google group about a way to specify a dynamic variable like $(hostname -i).

Ah, so in Consul I think this worked because we had a function that would scan for a private IP address, and automatically use that. There were a number of different opinions surrounding that option, and although in most cases it made things easier to start, it was intentionally left out so that it would always be clear which address we were binding and advertising. It just gets ambiguous if there are multiple interfaces at play.

I'm marking this as a thinking ticket, because the UX can probably be improved. Thanks for reporting.

Maybe, in case the specific IP address of the server isn't known in advance, its advertise addresses could be derived from a given subnet. Something along the lines of:

advertise {
  rpc = "10.0.0.0/8:4647"
  serf = "10.0.0.0/8:4648"
}

I already have a patch somewhere that would look for the first interface to have an IP on the given subnet and use it for advertising, if it is deemed interesting.

+1 to the subnet idea, or something similar to avoid having to know the IP up front

> I already have a patch somewhere that would look for the first interface to have an IP on the given subnet and use it for advertising, if it is deemed interesting.

@apognu That would be welcome!

Give me a few hours, I'll submit a PR.

I'd like to add another thing here: this should behave like Consul does, so -bind=:: should work (bind to any available IPv4 or IPv6 address), as should the command-line option -advertise=

In our Consul environment I set all nodes to -bind=:: and then -advertise=

On all masters I set up a secondary IP address of 10.255.255.255 on loopback (this is anycast through our network), so any client anywhere within our network can just use "-retry-join=10.255.255.255" and find the closest running master available.

Serf/Raft seems to take care of the rest: the client finds a master, gets the full list of all other masters, and then seems to ignore the -retry-join IP.

I'm also running into this. Consul's behavior seems to be what most people would like to see, including myself. Also, instead of giving an IP address, I would prefer to specify the network interface.

Based on the feedback here and feedback that we received in Consul we're considering the following:

  1. Support bind based on named interface (eth1).
  2. Support bind based on CIDR range (10.1.0.0/16). This allows you to get pretty granular if you have multiple interfaces or multiple IPs per interface.

Specifying the IP (as is currently supported) is pretty straightforward. If we do interface or CIDR we end up with a lot of messy edge cases.

  • Some interfaces may have aliases, like eth1:0, on Linux.
  • Some interfaces may have multiple IPs associated with them (e.g. IPv4 and IPv6).
    • Should we use the "first" IP that matches a specific interface or CIDR block instead of using all of them?
    • If so, how do we define "first"?
  • IPv6 CIDR is getting into weird territory.
  • Any other cross-platform considerations?

We end up using similar logic in at least 3 places so this is a good opportunity to factor it out:

  1. Binding Nomad APIs when the agent starts
  2. Detecting available networks during fingerprinting
  3. Placing tasks into specific networks

> I'd like to add another thing here: this should behave like Consul does, so -bind=:: should work (bind to any available IPv4 or IPv6 address), as should the command-line option -advertise=
> In our Consul environment I set all nodes to -bind=:: and then -advertise=

I'm not sure this is still supported in Consul 0.6.0. Also, since Nomad does not use serf across the entire cluster (only amongst the server nodes) we may not be able to do things exactly the same way that Consul does them.

Bit by this issue as well, +1 for binding to named interface.

Closing this as #941 lets you bind by interface name

It appears #941 was pulled out from this commit? https://github.com/hashicorp/nomad/commit/079e55e9935aab8b5ad13c54514f5ccfa7752371

Is there a corresponding issue or explanation of why the feature was removed?
