Kops: separate instancegroup for etcd

Created on 1 Nov 2016 · 10 comments · Source: kubernetes/kops

In #732, @justinb proposed specifying instancegroups in kops for etcd clusters rather than zones. The driving goal is to provide a level of resilience for Kubernetes masters "and cope with clouds where the zones concept doesn't necessarily apply" (justinb).

I imagine this looking something like `kops --num-etcds=3 --num-nodes 10 --num-masters 2 clustername.example.com`.

I cannot hope to one-up Kelsey's elegance in stating why:

  • The etcd lifecycle is not tied to Kubernetes. We should be able to upgrade etcd independently of Kubernetes.
  • Scaling out etcd is different than scaling out the Kubernetes Control Plane.
  • Prevent other applications from taking up resources (CPU, Memory, I/O) required by etcd.

To build on Kelsey's argument, something like the example command above would give us a resilient etcd cluster, resilient masters, and plenty of upgrade and maintenance options.

As a long-term "vision", picture kops upgrading etcd by standing up a new etcd cluster, migrating the data, and switching the masters to the new endpoint. Minimizing disturbances to the Kubernetes API from routine operational hiccups (upgrades, "EC2-Roulette", and more) becomes more and more important when multiple teams are doing very frequent updates and deploys to that cluster; think pipelines in the plural.
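To make that vision a bit more concrete, here is a rough, purely illustrative sketch of the migration step, assuming an etcd v3 snapshot-and-restore. The endpoints, member names, and paths are placeholders, not anything kops actually does today:

```sh
# Purely illustrative: endpoints, member names, and paths are placeholders.

# 1. Snapshot the existing etcd cluster.
ETCDCTL_API=3 etcdctl \
  --endpoints=https://etcd-old-a.internal.clustername.example.com:2379 \
  snapshot save /tmp/etcd-backup.db

# 2. Restore the snapshot onto each member of the new cluster
#    (shown here for member "etcd-new-a").
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db \
  --name etcd-new-a \
  --initial-cluster etcd-new-a=https://etcd-new-a.internal.clustername.example.com:2380,etcd-new-b=https://etcd-new-b.internal.clustername.example.com:2380,etcd-new-c=https://etcd-new-c.internal.clustername.example.com:2380 \
  --initial-advertise-peer-urls https://etcd-new-a.internal.clustername.example.com:2380 \
  --data-dir /var/lib/etcd-new

# 3. Repoint the API servers (kube-apiserver --etcd-servers=...) at the new
#    cluster and restart them, then retire the old etcd instance group.
```

With etcd in its own instance group, steps 1 and 2 can happen on machines the API servers never touch until the switchover in step 3.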

@chrislovecnm mentions concerns about cost and complexity as a counter-argument in #732. It would be helpful to enumerate the additional costs and complexity here for discussion.

Labels: lifecycle/frozen, lifecycle/stale

All 10 comments

I would like this for production-worthy deploys (or at least the option).

Who has this setup in production? Who can actually give us non-anecdotal data that this setup actually provides value?

The number of production deploys with this may be low because kops provides such great value out of the box. I'm betting a lot of people are willing to make that trade-off for the convenience of deploying Kubernetes clusters without having to write their own automation. Would it be better to weigh the pros and cons? Respectfully, should we take a closer look at some of the "whys" behind your reservations?

Happy to set up a time on our next kops call. I love simple when it comes to DevOps. There are two sides here: one says etcd on another box is simpler, and the other says etcd on the master alongside the API server is simpler.

I have run a cluster with 1009 nodes and a single master with etcd on it. It was for a demo, not anything super heavy production-wise, but 999 pets were happily chugging along.

I love data: people who have done this in the wild, and a justification for why we want to do this, rather than "oh, I read a git repo that did this install thingy and now this is what I want to do."

I do not want to have to wake up at 2 am because my pager goes off... a.k.a. my cell goes nuts and my wife wakes me. So let's put our heads together and come up with something awesome.

Who can actually speak to this with data, and experience? What ideas do we have? What is driving this consideration?

@chrislovecnm My comments are based on experience supporting people running etcd in production while employed at CoreOS (where we also ran etcd in production). Just like with any database, you have to consider the tradeoff of running your database on the same machines as the application(s) accessing it.

In some cases you can get away with running the database side by side with the application on the same machine; in others you'll want to move the database to a dedicated machine to help manage load and meet SLAs. You can boil this down to a matter of preference, but some situations force people to prefer etcd running on a different set of machines because:

  • Resource requirements (dedicated machines mean dedicated memory, CPU, disk, etc.)
  • Security requirements (dedicated machines mean limited "physical" access and separation of concerns)

At the end of the day, operations is about trade-offs. There will come a point when some people will need to partition the Kubernetes storage layer, and some cases, depending on load, will require multiple etcd clusters (that's why we allow you to split what objects are stored where).
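For reference, the "split what objects are stored where" point maps onto the kube-apiserver --etcd-servers-overrides flag. A minimal sketch, with placeholder hostnames, of routing Events to a second etcd cluster while everything else stays on the main one:

```sh
# Hostnames are placeholders; other kube-apiserver flags are omitted.
# --etcd-servers-overrides routes specific resources (here: events) to a
# separate etcd cluster, while everything else stays on the main one.
kube-apiserver \
  --etcd-servers=https://etcd-main.internal.example.com:2379 \
  --etcd-servers-overrides=/events#https://etcd-events.internal.example.com:2379
```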

This project is free to offer strong opinions regarding how to do things, which is fine, but our fellow sysadmins will need help when those opinions no longer fit their situation.

@kelseyhightower appreciate the insight. I do like the use cases that you bring up, and I am wondering who can help us with more information about what has occurred in production environments. What has worked well, and what has fallen down.

To me, it comes down to how to run etcd and the Kubernetes components well. Frankly, I do not have enough information to form an educated opinion that I would be happy with. My gut says one thing, but I would love the community to assist in making an informed decision.

Architecturally the changes are doable, but not trivial. The install process would need to be tweaked, and discovery would be modified as well.

@chrislovecnm / @watkinsv-hp We run multiple kubernetes clusters in AWS across AZs with multi-node etcd clusters.

I wrote a service to help us manage multi-AZ etcd clusters in AWS: https://github.com/UKHomeOffice/smilodon. We have had zero issues since. In principle, how smilodon works is very simple. I guess something like smilodon could be applied in kops.

We should not push this to end-users before it is end-to-end tested. kops is likely to take the lead here (we could enable it for e2e testing via a feature flag). However, the scalability team has not yet recommended it AFAIK.

@vaijab smilodon looks cool - we actually have something pretty similar (protokube), which is how kops manages etcd, based on the CoreOS recommendations: https://github.com/coreos/etcd/issues/5418. We went with DNS remapping so that it could be ported to other clouds / bare metal, but it looks very similar :-)
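To make the DNS-remapping idea concrete, here is a hedged sketch (member names, domain, and data directory are placeholders) of how each etcd member can be started with a stable DNS identity that protokube-style tooling repoints at whichever instance currently holds that member's data volume:

```sh
# Placeholders throughout: member names, domain, and data dir are illustrative.
# The member always advertises the same DNS name, so the instance behind that
# name can be replaced without reconfiguring the rest of the cluster.
etcd \
  --name etcd-a \
  --initial-cluster etcd-a=http://etcd-a.internal.clustername.example.com:2380,etcd-b=http://etcd-b.internal.clustername.example.com:2380,etcd-c=http://etcd-c.internal.clustername.example.com:2380 \
  --initial-advertise-peer-urls http://etcd-a.internal.clustername.example.com:2380 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --advertise-client-urls http://etcd-a.internal.clustername.example.com:2379 \
  --listen-client-urls http://0.0.0.0:2379 \
  --data-dir /var/lib/etcd
```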

Issues go stale after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

/lifecycle frozen
