I have my datacenter dc1 with various nodes, and I've bootstrapped a second datacenter, dc2.
As described in the guide (https://www.consul.io/docs/guides/datacenters.html), I need to run the command
consul join -wan
but the documentation isn't clear on these points:
Hi @shakisha I'll leave this open as a reminder to improve the docs in this section. Here's some info that should help:
What is the port number for WAN, and should it be listening on the public WAN interface?
The port you'd want to use for the join is the "Serf WAN" port which defaults to 8302. You'd need the Consul servers participating in the WAN to expose this port bound to an interface that's reachable from the other Consul servers (they need to form a fully connected mesh). You'll need TCP and UDP access to this port from any firewalls.
How do I put this consul join -wan into the configuration file so that I don't have to run the command manually from the CLI?
The best way to do this is via the retry_join_wan config option - https://www.consul.io/docs/agent/options.html#retry_join_wan.
Which datacenter does this part refer to: "Of course, all server nodes must be able to talk to each other. Otherwise, the gossip protocol as well as RPC forwarding will not work."?
The gossip part is the "Serf WAN" port discussed above. You'll also need to expose the "Server RPC" port 8300/tcp in order for RPC forwarding to work. All servers participating in the WAN should be able to reach each other on 8300/tcp.
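A minimal sketch, in Python, for probing the TCP side of these two ports from one server toward another. The function names are illustrative and not part of Consul; the port numbers are the defaults named above. It cannot verify the UDP half of the Serf WAN gossip, so a passing result is necessary but not sufficient.

```python
import socket

# Default ports named in the answer above; adjust if your agents override them.
SERF_WAN_PORT = 8302    # WAN gossip (needs TCP and UDP; only TCP is probed here)
SERVER_RPC_PORT = 8300  # server RPC, used for cross-datacenter forwarding


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


def check_wan_ports(host, timeout=3.0):
    """Probe both WAN-relevant TCP ports on one remote server."""
    return {
        port: tcp_port_open(host, port, timeout)
        for port in (SERF_WAN_PORT, SERVER_RPC_PORT)
    }
```

Running `check_wan_ports(...)` from each server against the address of every other server gives a quick picture of whether the mesh is fully connected over TCP; a `False` entry means the connection to that port could not be established.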
Hi James @slackpad, thank you very much for your answer.
I'm still having trouble getting WAN Consul to work properly. This is my configuration:
{
"server": true,
"datacenter": "lon01",
"data_dir": "/var/consul",
"log_level": "info",
"enable_syslog": false,
"encrypt": "blablabla",
"bind_addr": "MY LAN INTERFACE",
"retry_join_wan": ["IP ADDRESS OF REMOTE CONSUL SERVER"]
}
The remote Consul server has its bind address set to its LAN address.
When I start Consul in client mode, I get an error saying that WAN is only available in server mode.
After switching to server mode, this is what I get:
2016/04/12 17:55:31 [ERR] agent: failed to sync remote state: No cluster leader
2016/04/12 17:55:32 [ERR] agent: coordinate update error: No cluster leader
2016/04/12 17:55:35 [INFO] agent: (WAN) joining: [SERVERIP]
2016/04/12 17:55:35 [INFO] agent: (WAN) joined: 0 Err: dial tcp SERVERIP:8302: getsockopt: connection refused
2016/04/12 17:55:35 [WARN] agent: Join -wan failed: dial tcp SERVERIP:8302: getsockopt: connection refused, retrying in 30s
Where am I going wrong? I can't find anything about this issue in the documentation ;(
Hello @slackpad, I still haven't had any success with this :-(
Do you have any update for me? Thanks a lot!
Hi @shakisha I think it might be "bind_addr": "MY LAN INTERFACE". Is that interface reachable by the other server (it's the TCP connection to SERVERIP:8302 that is failing)? If there are multiple interfaces, you can remove that setting so Consul binds to all of them, and then set advertise_addrs for serf_lan and serf_wan to the IP addresses that this server should advertise for others to contact it on.
@slackpad: this is exactly the part of the docs that I didn't understand.
For the bind address I only have the LAN interface, because I have 10 clients on the local LAN (datacenter0).
In datacenter "lon1" I have the configuration listed above and get the connection refused error, but from tcpdump I'm pretty sure that the lon1 server is contacting the server in "datacenter0".
Should I remove the bind address entirely and use https://www.consul.io/docs/agent/options.html#advertise_addrs, or set only the advertised WAN address?
Would it be secure to expose only that port with encryption enabled?
UPDATE: I have tried putting these values in the configuration of datacenter0 and lon1:
"advertise_addrs": {
"serf_wan": "WAN_IP_ADDRESS:8302",
"serf_lan": "LAN_IP_ADDRESS:8301",
"rpc": "LAN_IP_ADDRESS:8300"
}
but it didn't work; I still get the same "getsockopt: connection refused" (firewalls are disabled on both sides).
I've also tried removing bind_addr, but even with the serf_lan, serf_wan and rpc parameters, Consul refused to start because of multiple private IP addresses on the machine.
I'm desperate.
Hi @shakisha and sorry for the trouble on this one. It looks like there's a bug where it's doing the private IP check, even though you've specified the advertise_addrs structure. The individual form of these configs looks like they should properly skip that check:
{
"advertise_addr": "LAN_IP_ADDRESS",
"advertise_addr_wan": "WAN_IP_ADDRESS",
"bind_addr": "0.0.0.0"
}
You are right that this type of configuration will result in Consul binding to all interfaces on a machine so you'd want to enable encryption for the serf ports and TLS w/verification for the RPC port. Many folks bind everything to the LAN interface and use a NAT / VPN / tunneled connection to avoid exposing anything externally.
Also noting here that we should allow the configuration of a different bind address for LAN and WAN interfaces - this isn't currently supported. Linking https://github.com/hashicorp/consul/issues/473 since it's related.
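As a rough sketch of what "encryption for the serf ports and TLS with verification for the RPC port" could look like alongside the bind-all config, the Python below assembles such a config and serializes it as JSON. The helper names, the certificate paths and the addresses passed in are assumptions for illustration; `encrypt`, `verify_incoming`, `verify_outgoing`, `ca_file`, `cert_file` and `key_file` are Consul agent options, but check the options documentation for your version before relying on them.

```python
import json


def bind_all_secure_config(lan_ip, wan_ip, gossip_key, tls_dir="/etc/consul/tls"):
    """Build a bind-all agent config with gossip encryption and RPC TLS.

    tls_dir and the file names under it are hypothetical; point them at
    wherever your CA and server certificates actually live.
    """
    return {
        "bind_addr": "0.0.0.0",
        "advertise_addr": lan_ip,
        "advertise_addr_wan": wan_ip,
        # Shared gossip key (e.g. from `consul keygen`); covers Serf LAN and WAN.
        "encrypt": gossip_key,
        # Require and verify TLS on incoming and outgoing connections.
        "verify_incoming": True,
        "verify_outgoing": True,
        "ca_file": f"{tls_dir}/ca.pem",
        "cert_file": f"{tls_dir}/server.pem",
        "key_file": f"{tls_dir}/server-key.pem",
    }


def render(config):
    """Serialize a config dict as indented JSON suitable for a config file."""
    return json.dumps(config, indent=2)
```

Calling `render(bind_all_secure_config("LAN_IP_ADDRESS", "WAN_IP_ADDRESS", "GOSSIP_KEY"))` yields a JSON document that can be reviewed and adapted; it does not replace firewalling the ports you do not want exposed.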
Hi @slackpad,
Many folks bind everything to the LAN interface and use a NAT / VPN / tunneled connection to avoid exposing anything externally
This is a GREAT idea, but how do I do that if the VPN is on the tun0 interface and I have to listen on both eth1 (LAN) and tun0 (VPN)?
@shakisha I was thinking for the case where some other box provides the gateway - if the server itself has the tunnel then there's currently no way to support this other than bind 0.0.0.0 per https://github.com/hashicorp/consul/issues/1914#issuecomment-210617451 and possibly some firewall rules to pare down the configuration. Once we get #473 and the option to bind (potentially multiple addresses) for each function (wan, lan, rpc) then this should get easier to configure.
Thanks for your answer. So if I bind to 0.0.0.0, it will bind to all interfaces? Then I can just listen on all interfaces and use iptables to block unwanted ports. Tell me if I'm correct.
Yes - this config should bind to all interfaces:
{
"advertise_addr": "LAN_IP_ADDRESS",
"advertise_addr_wan": "WAN_IP_ADDRESS",
"bind_addr": "0.0.0.0"
}
The advertise addresses tell other hosts in the cluster how to contact the server.
Can I just put this?
{
"bind_addr": "0.0.0.0"
}
Because I want to use the LAN and tun0 (VPN)... or do you advise using the tun0 address instead of WAN_IP_ADDRESS in this config?
"advertise_addr": "LAN_IP_ADDRESS" and "advertise_addr_wan": "WAN_IP_ADDRESS" shouldn't affect any binding but they will prevent the "multiple private IP" error since you have several interfaces and Consul won't be able to decide which one to use.
OK, but will this config listen for packets coming from the tun0 interface?
It should - it will bind to everything!
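The "bind to everything, then filter with iptables" approach discussed above can be sketched as a small rule generator. This is only an illustration: the interface name (eth0 as the WAN side), the choice of which ports to drop, and the helper names are assumptions, and the output is in iptables-save format for review rather than something to apply blindly.

```python
def port_rules(interface, action, tcp_ports=(), udp_ports=()):
    """Return iptables-save style INPUT rules for the given interface."""
    rules = []
    for proto, ports in (("tcp", tcp_ports), ("udp", udp_ports)):
        for port in ports:
            rules.append(
                f"-A INPUT -i {interface} -p {proto} -m {proto} "
                f"--dport {port} -j {action}"
            )
    return rules


def wan_rules(wan_interface="eth0"):
    """Allow only server RPC and Serf WAN on the WAN side; drop Serf LAN."""
    allowed = port_rules(wan_interface, "ACCEPT",
                         tcp_ports=(8300, 8302), udp_ports=(8302,))
    blocked = port_rules(wan_interface, "DROP",
                         tcp_ports=(8301,), udp_ports=(8301,))
    return allowed + blocked
```

For example, `print("\n".join(wan_rules("eth0")))` lists the rules one per line so they can be compared against the ruleset already on the host.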
Hi all, are there any updates on this issue? :-)
@shakisha please give the latest code in master a twirl and let us know. The syntax for selecting IP addresses, such as getting the first "usable" IP address on an interface, or any other manner of network craziness is likely possible now as a template evaluated address parameter can be passed to Consul addresses (e.g. -bind or bind_addr, or any other *_addr-like parameter):
-bind='{{ GetInterfaceIP "eth0" }}'
-bind='{{ GetAllInterfaces | include "network" "10.99.0.0/24" }}'
-bind='{{ GetDefaultInterfaces | include "network" "10.99.0.0/24" | sort "size,address" | attr "address" }}'
-bind='{{ GetAllInterfaces | exclude "rfc" "6890" | sort "type,size,address" | include "flags" "up|forwardable" | attr "address" }}'
Very few people should need to do anything as obscene as shown in the last example, but the functionality is there should you need it. I gave this issue a quick read and it seems like you could use this configuration enhancement to select the tun0 address for the bind_addr and then use a different parameter value for the advertise_addr. This doesn't solve your exact problem of listening on multiple interfaces or IPs, but it's on its way in the near future.
With the sockaddr command you can experiment with getting the right template syntax with the eval sub-command, for instance:
$ sockaddr eval 'GetAllInterfaces | include "network" "10.99.0.0/24" | sort "size,address" | attr "address"'
10.99.0.5
$ sockaddr eval 'GetInterfaceIP "eth0"'
10.99.0.5
There is now a configurable template language behind this that you can use to create a customizable heuristic that should allow you to get whatever it is that you need from your environment when using an immutable image (see hashicorp/go-sockaddr/template and cmd/sockaddr for examples and docs).
If this doesn't solve your issue, let us know.
Hello @sean- and @slackpad,
thanks for your answers. Let me explain my situation better so we can take a big step towards improving this Consul version.
@sean- I only saw your reply yesterday; can you confirm that I can test directly with the 0.7.2 RC?
I actually have two datacenters:
DC1: a datacenter with a single Consul machine, connected over the WAN.
DC2: a datacenter with 6 Consul machines, connected over the LAN, each with a public IP address.
Basically, from the DC1 datacenter I want to query the DC2 datacenter to get a list of the public IP addresses of all the DC2 machines (to use it with consul-template, another project of yours which I love).
So when running this command on the single consul machine of DC1:
curl http://localhost:8500/v1/catalog/nodes?dc=DC1
or
curl http://localhost:8500/v1/catalog/nodes?dc=DC2
in BOTH CASES I get
rpc error: failed to get conn: dial tcp IP_ADDRESS_OF_THE_NODE_ON_DC2:8300: getsockopt: connection refused
These are the configurations that I used:
(@sean- I'm sorry that I didn't test your approach, like
GetInterfaceIP "eth0"
but I don't know the exact syntax for putting it inside the configuration file; I don't like using the command line :-) )
CONFIGURATION OF NODE ON DC1
{
"server": true,
"datacenter": "DC1",
"data_dir": "/var/consul",
"log_level": "WARN",
"enable_syslog": false,
"encrypt": "secretkey",
"bind_addr": "private eth1",
"advertise_addr_wan": "public ip of eth0",
"retry_join_wan": ["public ip of node2 on dc2", "public ip of node3 on dc2", "public ip of node4 on dc2"]
}
CONFIGURATION OF NODE ON DC2
{
"bootstrap_expect": 3,
"server": true,
"datacenter": "DC2",
"data_dir": "/var/consul",
"log_level": "WARN",
"bind_addr": "private eth1",
"serf_wan_bind": "public ip of eth0",
"advertise_addr_wan": "public ip of eth0",
"retry_join": ["private lan ip of node3", "private lan ip of node4"],
"retry_join_wan": ["public ip of node1 on dc2", "public ip of node3 on dc2"],
"encrypt": "secretkey"
}
@shakisha Try with the following config snippets:
CONFIGURATION OF NODE ON DC1
{
"server": true,
"datacenter": "DC1",
"data_dir": "/var/consul",
"log_level": "WARN",
"enable_syslog": false,
"encrypt": "secretkey",
"bind_addr": "{{ GetInterfaceIP \"eth1\" }}",
"advertise_addr_wan": "{{ GetInterfaceIP \"eth0\" }}",
"retry_join_wan": ["public ip of node2 on dc2", "public ip of node3 on dc2", "public ip of node4 on dc2"]
}
You will have to plug in the IP/DNS addresses for retry_join_wan. It would be more concise if you didn't want to pick out an IP address from a particular interface name because then you could use something like GetPublicIP or GetPublicInterfaces, or GetPrivateIP or GetPrivateInterfaces. The same type or style of configuration could be used for the second configuration block.
If you want to test and experiment via the CLI, you can via:
$ go get -u github.com/hashicorp/go-sockaddr/cmd/sockaddr
$ sockaddr eval 'GetInterfaceIP "eth0"'
$ sockaddr eval 'GetAllInterfaces | include "name" "eth0" | sort "type,size" | include "RFC" "6890" | include "flags" "up|forwardable" | attr "address"'
Just be sure to re-escape the double quotes before injecting the working template back into your Consul config. Keep us posted!
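One way to avoid hand-escaping mistakes is to let a JSON serializer do the escaping. The sketch below takes bare template expressions as they would be passed to `sockaddr eval`, wraps them in `{{ }}`, and emits a config fragment with the inner quotes escaped. The helper names are made up for illustration, and the interface names are the ones from the example above.

```python
import json


def as_template(expr):
    """Wrap a bare sockaddr expression in text/template delimiters."""
    return "{{ " + expr.strip() + " }}"


def config_fragment(expressions):
    """Map config keys to templated values and serialize them as JSON.

    json.dumps escapes the embedded double quotes, so the values can be
    copied into a Consul JSON config file without further editing.
    """
    return json.dumps(
        {key: as_template(expr) for key, expr in expressions.items()},
        indent=2,
    )


fragment = config_fragment({
    "bind_addr": 'GetInterfaceIP "eth1"',
    "advertise_addr_wan": 'GetInterfaceIP "eth0"',
})
print(fragment)
```

The printed values carry the same backslash-escaped quotes as the hand-written config above, which is the form the JSON parser expects.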
@sean- doing now 👍
Got
==> Error starting agent: Advertise WAN address resolution failed: Unable to parse address template "{{ GetInterfaceIP \"eth0\" | sort \"type,size\" | exclude \"RFC\" \"6890\" | include \"flags\" \"up|forwardable\" | attr \"address\" }}": unable to execute sockaddr input "{{ GetInterfaceIP \"eth0\" | sort \"type,size\" | exclude \"RFC\" \"6890\" | include \"flags\" \"up|forwardable\" | attr \"address\" }}": template: sockaddr.Parse:1:32: executing "sockaddr.Parse" at <"type,size">: wrong type for value; expected sockaddr.IfAddrs; got string
@sean- and if I write it this way, I get:
"bind_addr": "{{ GetInterfaceIP "eth1"| sort "type,size" | include "RFC" "6890" | include "flags" "up|forwardable" | attr "address" }}",
"advertise_addr_wan": "{{ GetInterfaceIP "eth0" | sort "type,size" | exclude "RFC" "6890" | include "flags" "up|forwardable" | attr "address" }}",
Error decoding '/etc/consul/conf/config.json': invalid character 'e' after object key:value pair
Comment updated. I think I copy/pasted the wrong set of examples in. The following two should work and be very close to the same in their result.
$ sockaddr eval 'GetInterfaceIP "eth0"'
$ sockaddr eval 'GetAllInterfaces | include "name" "eth0" | sort "type,size" | include "RFC" "6890" | include "flags" "up|forwardable" | attr "address"'
@sean- should I put these inside the configuration file?
$ sockaddr eval 'GetInterfaceIP "eth0"'
$ sockaddr eval 'GetAllInterfaces | include "name" "eth0" | sort "type,size" | include "RFC" "6890" | include "flags" "up|forwardable" | attr "address"'
Like
"bind_addr": "sockaddr eval 'GetInterfaceIP "eth0"'",
"advertise_addr_wan": "sockaddr eval 'GetAllInterfaces | include "name" "eth0" | sort "type,size" | include "RFC" "6890" | include "flags" "up|forwardable" | attr "address"'"
?
Not quite, sorry. A few points (being explicit for future readers):
The sockaddr eval ... command is there merely for experimentation.
Once a template works with sockaddr eval you can drop it into the actual Consul config.
I would use either GetInterfaceIP "eth0" or the fully spelled-out pipe chain, and wouldn't mix and match them.
In the config, the template must be wrapped in the text/template delimiters of {{ and }}, respectively, in order to preserve backwards compatibility.
So for instance, you can pick one of the two following config blocks:
"bind_addr": "{{ GetInterfaceIP \"eth0\" }}",
"advertise_addr_wan": "{{ GetInterfaceIP \"eth0\" }}"
or:
"bind_addr": "{{ GetAllInterfaces | include \"name\" \"eth0\" | sort \"type,size\" | include \"RFC\" \"6890\" | include \"flags\" \"up|forwardable\" | attr \"address\" }}",
"advertise_addr_wan": "{{ GetAllInterfaces | include \"name\" \"eth0\" | sort \"type,size\" | include \"RFC\" \"6890\" | include \"flags\" \"up|forwardable\" | attr \"address\" }}"
Also, it isn't necessary to specify both bind_addr and advertise_addr_wan because both advertise_addr_lan and advertise_addr_wan will default to the specified bind_addr. I think the simplified config block that you want is to just specify the advertise_addr using one of the two config bits above and leave it at that. No need to make it any more complicated than it already is. Setting both bind_addr and advertise_addr is still required and will be through the rest of the 0.7.X series, but expect changes before 0.8.0 is released.
Thanks @sean-
OK, two issues now:
The "{{ GetInterfaceIP \"eth0\" }}'" template resolves to a strange IP address (10.9.2.0), which is not my eth0 IP.
From the Consul WAN machine, I ran
curl http://localhost:8500/v1/catalog/nodes?dc=de01
and I got
rpc error: failed to get conn: dial tcp "an ip address of a consul wan machine of remote dc" :8300: getsockopt: connection refused
In iptables I have these rules:
````
-A INPUT -i eth1 -p tcp -m tcp --dport 8300 -j ACCEPT
-A INPUT -i eth1 -p tcp -m tcp --dport 8301 -j ACCEPT
-A INPUT -i eth1 -p udp -m udp --dport 8301 -j ACCEPT
-A INPUT -i eth1 -p tcp -m tcp --dport 8400 -j ACCEPT
-A INPUT -i eth1 -p tcp -m tcp --dport 8500 -j ACCEPT
-A INPUT -i eth1 -p tcp -m tcp --dport 8600 -j ACCEPT
-A INPUT -i eth1 -p udp -m udp --dport 8600 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 8300 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 8302 -j ACCEPT
-A INPUT -i eth0 -p udp -m udp --dport 8302 -j ACCEPT
````
Basically, it seems that Consul doesn't listen on port 8300 for requests from the WAN machine.
I cannot find a solution.
@slackpad do you have any idea?
Still debugging here.
Basically, the WAN requests work if I set the "bind" address to the WAN interface only.
If I set "bind" to 0.0.0.0, Consul 0.7.4 doesn't even start, failing with the error:
"Error starting agent: Failed to get advertise address: Multiple private IPs found. Please configure one"
I have also tried with
"serf_wan_bind": "ip of eth0",
"serf_lan_bind": "ip of eth1",
and nothing changed; still the same error.
The question is:
if I have "bind" set to the private LAN interface,
and "serf_wan_bind" and "advertise_addr_wan" correctly set to the public IP address,
why does consul members -wan work
while
curl http://localhost:8500/v1/catalog/nodes?dc=de01
gives me connection refused?
OK, I have identified the issue and found a workaround.
If I set
"bind" to PRIVATE INTERFACE (LAN)
"serf_wan_bind": "PUBLIC INTERFACE",
"serf_lan_bind": "PRIVATE INTERFACE",
NOTHING WORKS,
but if I set
"bind" to PUBLIC INTERFACE (WAN)
"serf_wan_bind": "PUBLIC INTERFACE",
"serf_lan_bind": "PRIVATE INTERFACE",
everything works perfectly (at the moment I see a lot of "Refuting a suspect message" log lines, but everything seems to work).
Basically, RPC forwarding is not working as expected:
-serf-wan-bind doesn't allow queries from other Consul WAN servers on port 8300.
Why does this happen? @slackpad is this a bug?
My odyssey with this issue continues...
I need to revise my previous comment. After some tests, if I set
"bind" to PUBLIC INTERFACE (WAN)
"serf_wan_bind": "PUBLIC INTERFACE",
"serf_lan_bind": "PRIVATE INTERFACE",
the Consul WAN machine works, but leader election and local members do not work :-(
The configuration file now has the following values:
"bind_addr": "PUBLIC IP",
"advertise_addr": "PRIVATE IP",
"serf_lan_bind": "PRIVATE IP",
"serf_wan_bind": "PUBLIC IP",
"advertise_addr_wan": "PUBLIC IP",
"translate_wan_addrs": true,
but leader election doesn't happen this way.
Hello, we have much better docs these days: https://learn.hashicorp.com/consul/security-networking/datacenters and I hope they solve your problem. If that's not the case, feel free to open a new issue. Thanks for reporting!