Origin: About the host IP of the HA router

Created on 18 Dec 2015 · 5 comments · Source: openshift/origin

It seems that IP failover is used to set up high availability for the router, and that it is supposed to keep the virtual IP unchanged when the service is deployed onto a new node.

My question: the router service needs a fixed host IP (is that true?), which is what gets configured in the DNS server. So how do I keep this IP unchanged when the router service is deployed to a new node?

Is this part of the HA router solution, or does it need to be set up outside of OpenShift? If so, any good suggestions?

Thanks.

kind/question

All 5 comments

@kargakis or @liggitt could you help me with this question? Many thanks :)

@deads2k do you know who can help with this question? Thanks.

@ramr @pweil-

@thincal so if you configure a DNS entry using a node's IP address - what happens if the node goes down? You would need some sort of IP failover mechanism to ensure something (in this case a router) is always serving on that IP. If your environment is tolerant of some downtime, you could instead manually update the DNS entry or start up a new node with that IP address.

Of course, this requires you to know when a problem happens and to be able to take the manual steps to fix it immediately.

The HA solution handles these failure cases but also automates the recovery, so there is really no manual intervention needed. In a nutshell, rather than using a node's IP address to configure the DNS entry, you would instead use VIPs (virtual IP addresses) to handle failures. The VIPs would float between any of the nodes where ipfailover (really keepalived) is running in your OpenShift cluster, and handle failure cases when router instances die and are reborn or when nodes die.
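
To make "floating" a bit more concrete, here is a rough sketch of what a VIP move amounts to at the network level. The interface name eth0 and the VIP 10.1.100.1 (used in the example below) are just placeholders, and keepalived performs these steps for you automatically - you would not normally run them by hand:

    # On the node taking over the VIP (keepalived automates this):
    ip addr add 10.1.100.1/32 dev eth0     # attach the VIP as a secondary address
    arping -c 3 -U -I eth0 10.1.100.1      # gratuitous ARP so neighbors update their caches

    # On the node giving up the VIP:
    ip addr del 10.1.100.1/32 dev eth0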

Hopefully an example with steps might make it clearer:

  1. Let's say you have an environment with 'n' nodes and you designate 5 nodes to run your infrastructure services (n > 5 of course).
    Let's further assume that the IP addresses for those 5 infrastructure nodes are 10.1.1.1 - 10.1.1.5.

    1. Rather than configure your DNS entry with those 5 IP addresses (which would all need a router running and which would all need to be up, otherwise a percentage of your requests would fail), you instead assign some virtual IPs to your environment.

      So let's use 3 VIPs (10.1.100.1, 10.1.100.2 and 10.1.100.3) -- note the 10.1.100.* block we are using here.

    2. Configure your DNS entry for www.example.test to resolve to those 3 VIPs 10.1.100.[1-3].

    3. Now we need to ensure that one or more routers is always servicing those VIPs on your cluster for high availability. Remember we had 5 infrastructure nodes allocated to run the routers.

      Assuming these 5 infrastructure nodes are labeled with "infra=red" (see the sketch after this list), we can now run an ipfailover service to ensure those 3 VIPs are always on one or more of those 5 infrastructure nodes, and also monitor port 80 (which the router runs on - or you could use port 443) so that we can watch for router failures.

      oadm ipfailover --replicas=5 --virtual-ips="10.1.100.1-3" --selector="infra=red" --watch-port=80 ...

    4. Ok, the next part of the puzzle is to run the routers on these infrastructure nodes. But we don't want to use all the nodes; let's say we use 2 of them and use the same selector (infra=red, as above).

      oadm router ... --selector="infra=red" --replicas=2 # 2 router instances.

    5. Of course this means only 2 router instances run on the cluster, but because of the ipfailover piece, the nodes where the routers are running would get one or more of the 3 virtual IPs 10.1.100.[1-3] allocated to them. Now, since the router binds to 0.0.0.0, it is also bound on the virtual IP address (with some kernel magic) and so serves requests on that VIP.

    6. This is all good in steady state. When a failure occurs - it could be a node dying off, a router crashing, the OOM killer taking out the router, etc. - the keepaliveds clustered together (they do need multicast, as they send VRRP advertisements) would notice that a router is down.

      Either the local node's keepalived notices nothing bound on port 80/443, or a remote keepalived notices a peer is down. In either case, one of the peer keepaliveds (one that has a router instance running on its node) would pick up the VIP and now service it. And thanks to the aforementioned magic, since the router binds to 0.0.0.0, it automatically starts servicing the newly added VIP on its node.

    7. As a result, the requests would always be serviced on those VIPs even if there is a failure.

    8. A similar scenario - VIPs floating over to another node in the cluster - happens when a new router instance joins the fray: for example, after a node failure, OpenShift/Kubernetes notices a pod is missing and starts up a replacement router pod on another of those 5 infrastructure nodes.

      The keepaliveds would notice a new router in the mix and float the VIPs over to that node if needed.
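
Putting steps 3-6 together, here is a rough sketch of the supporting commands. The node names (infra-node-1 through infra-node-5) are hypothetical placeholders for illustration; adjust them to your environment:

    # Label the 5 infrastructure nodes so the ipfailover and router selectors match them:
    oc label node infra-node-1 infra=red
    oc label node infra-node-2 infra=red
    oc label node infra-node-3 infra=red
    oc label node infra-node-4 infra=red
    oc label node infra-node-5 infra=red

    # Verify the DNS entry resolves to the three VIPs:
    dig +short www.example.test

    # See where the router pods landed:
    oc get pods -o wide | grep router

    # On an infrastructure node that currently holds a VIP, keepalived adds it
    # as a secondary address, so it shows up next to the node's own IP:
    ip addr show | grep 10.1.100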

Now, that said, you can also do this with a similar, non-VIP-oriented approach outside of OpenShift - for example, run a highly available external load balancer service and point it at your routers.
If you are already running on one of the cloud providers (Amazon, Rackspace, GCE), you could use their load balancers (e.g. ELB on Amazon EC2) to divert traffic to the OpenShift router.

HTH

@ramr thanks very much for the detailed explanation!
