Configure the local node.
There will probably be some caveats, such as certs/tokens management.
For the pull mode, what do you think about the following?
1 - Create the etcd cluster in a way that it can be scaled; we need to discuss the proper way to do it: static configuration or discovery.
How can we scale up in the case of autoscaling?
2 - Use the etcd cluster to store the inventory and use a dynamic inventory.
Example: https://gist.github.com/justenwalker/09698cfd6c3a6a49075b
3 - A big issue is secrets management: where do we store the certs/tokens? How do we sync them between the nodes? Do we have to create a cert per node? ...
4 - Use ansible pull-mode
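For point 2, a minimal sketch of what an etcd-backed dynamic inventory script could look like (the key layout and the `build_inventory()` helper are hypothetical; in a real setup the group/host mapping would be fetched from etcd rather than hard-coded):

```python
#!/usr/bin/env python3
# Hypothetical sketch of an Ansible dynamic inventory backed by etcd.
# Assumes node names are stored under a key prefix such as
# /v2/keys/inventory/<group>/<host>; that layout is illustrative only.
import json


def build_inventory(etcd_groups):
    """Turn a flat {group: [hosts]} mapping (as read from etcd) into
    the JSON structure that `ansible-playbook -i inventory.py` expects."""
    inventory = {"_meta": {"hostvars": {}}}
    for group, hosts in etcd_groups.items():
        inventory[group] = {"hosts": hosts}
    return inventory


if __name__ == "__main__":
    # In a real setup this dict would come from an etcd client call,
    # e.g. a GET against http://<etcd>:2379/v2/keys/inventory?recursive=true
    sample = {"kube-master": ["node1"], "kube-node": ["node1", "node2"]}
    print(json.dumps(build_inventory(sample)))
```

Ansible only requires that the script print this JSON shape on `--list`, so the etcd client part can be swapped without touching the playbooks.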
For public cloud consumers, etcd discovery is probably optimal, since it almost never results in a broken cluster. Anyone deploying in-house might be reluctant to use discovery; an initial cluster array is adequate already.
Dynamic inventory plus deploying etcd via ansible creates a chicken-and-egg problem: you can't use an inventory from etcd until etcd is up. Also, you need a way to populate etcd in the first place. I would vote against adding complexity just for the sake of finding an innovative way to consume etcd.
Secrets management is a topic I've dealt with in previous projects. We currently have one master host which knows all the information. If you want to move to client-pull mode, all clients need to know where the host(s) that hold the secrets are located. Secret file storage should be replicated and transmitted using an encrypted method (ansible's SSH/rsync transport is totally fine).
I think you should add a new role for secrets: the first alphabetical node generates the secrets, while the others in that role take a full copy; all other nodes only fetch the secrets as needed. It's important to ensure that scale-up/scale-down scenarios are covered.
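The "first alphabetical node generates, the others copy" idea above amounts to a deterministic, coordination-free election. A minimal sketch (the function name and host names are made up for illustration):

```python
# Hypothetical sketch: deterministically pick the node that generates
# secrets, so every host agrees without needing a coordination service.
def elect_secret_generator(hosts):
    """Return the node responsible for generating secrets: the first
    host in alphabetical order. All other hosts copy from it."""
    if not hosts:
        raise ValueError("no hosts in inventory")
    return sorted(hosts)[0]


# Every node runs the same function over the same inventory and reaches
# the same answer; the choice also survives scale-up/scale-down as long
# as the elected node is still present.
print(elect_secret_generator(["node2", "node3", "node1"]))  # node1
```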
Thank you mattymo for your answer.
We can let the user choose how they want to deploy the etcd cluster.
The pull mode would just be an option.
I understand that etcd would become a strong dependency, but when, for instance, a new node is added, it needs to know about the cluster topology (where the API is, where etcd is, ...).
If you think about another option, we can evaluate it too.
Regarding the secrets:
all clients need to know where the host(s) that hold the secrets are located.
This is the reason why we need an inventory.
What you describe is exactly kargo's current behaviour, and it works just fine.
We can probably keep it if we have an inventory somewhere (e.g. etcd).
i'm probably missing something, but why not consider DNS discovery with SRV records vs. etcd discovery?
This is one of the discovery options that etcd offers, and i'm actually considering it @v1k0d3n
@rustyrobot , i need your input here too :)
it's always been the easiest for me when building and tearing down etcd clusters for kubernetes during testing (granted, i've been pulled away from doing this in recent months so some of the syntax may have changed with etcd2/3).
i just created srv records on my dns server:
; Kubernetes ETCD Server Cluster Information
_etcd-server._tcp.domain.com. 300 IN SRV 0 0 2380 kubetcd01.domain.com.
_etcd-server._tcp.domain.com. 300 IN SRV 0 0 2380 kubetcd02.domain.com.
_etcd-server._tcp.domain.com. 300 IN SRV 0 0 2380 kubetcd03.domain.com.
_etcd-server._tcp.domain.com. 300 IN SRV 0 0 2380 kubetcd04.domain.com.
_etcd-server._tcp.domain.com. 300 IN SRV 0 0 2380 kubetcd05.domain.com.
; Kubernetes ETCD Client Cluster Information
_etcd-client._tcp.domain.com. 300 IN SRV 0 0 2379 kubetcd01.domain.com.
_etcd-client._tcp.domain.com. 300 IN SRV 0 0 2379 kubetcd02.domain.com.
_etcd-client._tcp.domain.com. 300 IN SRV 0 0 2379 kubetcd03.domain.com.
_etcd-client._tcp.domain.com. 300 IN SRV 0 0 2379 kubetcd04.domain.com.
_etcd-client._tcp.domain.com. 300 IN SRV 0 0 2379 kubetcd05.domain.com.
; 10.1.1.0/24 - A Records: Kubernetes/Etcd Members
kubetcd01 IN A 10.1.1.21
kubetcd02 IN A 10.1.1.22
kubetcd03 IN A 10.1.1.23
kubetcd04 IN A 10.1.1.24
kubetcd05 IN A 10.1.1.25
and then configure the etcd cluster for dns discovery (example for kubetcd01)...
# [member]
ETCD_NAME=kubetcd01
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_SNAPSHOT_COUNTER="1000"
ETCD_ELECTION_TIMEOUT="1000"
ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://127.0.0.1:4001,http://kubetcd01.domain.com:2379,http://kubetcd01.domain.com:4001"
ETCD_LISTEN_PEER_URLS="http://kubetcd01.domain.com:2380"
#[cluster]
ETCD_DISCOVERY_SRV="domain.com"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://kubetcd01.domain.com:2380"
ETCD_INITIAL_CLUSTER_TOKEN="domain-etcd"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_ADVERTISE_CLIENT_URLS="http://kubetcd01.domain.com:2379,http://kubetcd02.domain.com:2379,http://kubetcd03.domain.com:2379,http://kubetcd04.domain.com:2379,http://kubetcd05.domain.com:2379"
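With `ETCD_DISCOVERY_SRV` set, etcd itself looks up the `_etcd-server._tcp.<domain>` SRV records and derives its initial cluster from them. A rough sketch of that derivation, mirroring the zone file above (illustrative only, not etcd's actual code; real code would resolve the SRV records via DNS):

```python
# Illustrative sketch: turn _etcd-server._tcp SRV lookup results into
# the comma-separated name=peer-url list etcd uses as its initial cluster.
def initial_cluster_from_srv(srv_targets, scheme="http"):
    """srv_targets: list of (target_fqdn, port) tuples from the
    _etcd-server._tcp SRV lookup."""
    entries = []
    for target, port in srv_targets:
        name = target.split(".")[0]  # e.g. kubetcd01
        entries.append(f"{name}={scheme}://{target}:{port}")
    return ",".join(entries)


# Same five records as the zone file above:
records = [(f"kubetcd{i:02d}.domain.com", 2380) for i in range(1, 6)]
print(initial_cluster_from_srv(records))
# kubetcd01=http://kubetcd01.domain.com:2380,...,kubetcd05=http://kubetcd05.domain.com:2380
```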
I would not use ansible-pull if possible. Also, about the all-in-one image:
there are 2 images:
one with the deployment scripts, and one with all the tools.
I'll detail more later.
@ant31 yes, please do :)
@v1k0d3n how would you delete or add members?
i would let the users control that on the DNS side, and use a proxy on the etcd members: destroy and/or rebuild, and add via dns. i mean, it's raft, so 3 or 5 members is ideal; how many members do you really want over that? my biggest stumbling block right now with kargo is that i have this great srv/dns framework in place that i can't use to bring up etcd. :(
@Smana @v1k0d3n What I know about the current state of etcd is that there is no way to manage membership other than keeping a static list of etcd members and synchronizing it with the cluster by explicitly calling etcdctl member add <node> and etcdctl member remove <node>. The documentation also explicitly states that discovery should be used only for cluster bootstrapping; after the cluster is created, discovery becomes kind of useless. Also, public discovery is not always an option when we are talking about data centers with a firewall in front (which may block some or all traffic for security reasons); in that case, deploying your own HA discovery system is yet another issue.
There is a good video on lifecycle management of etcd from CoreOS Fest, which took place in Berlin. Basically, the presenter had to invent a new tool on top of etcd in order to do proper cluster management; it's not a trivial task, and I would suggest going with a static list as the simplest and most straightforward solution, until something like that is supported by etcd natively.
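For reference, the explicit membership calls mentioned above look roughly like this with the etcd v2 `etcdctl` (endpoints and node names are illustrative; this is a fragment to run against a live cluster, not a standalone script):

```shell
# Scale up: register the new member with the running cluster first.
# The new node must then start etcd with ETCD_INITIAL_CLUSTER_STATE="existing".
etcdctl --endpoint http://kubetcd01.domain.com:2379 \
  member add kubetcd06 http://kubetcd06.domain.com:2380

# Scale down: look up the member ID, then remove it.
etcdctl --endpoint http://kubetcd01.domain.com:2379 member list
etcdctl --endpoint http://kubetcd01.domain.com:2379 member remove <member-id>
```

This is exactly the bookkeeping that a static-list approach has to automate, which is why it needs to be driven from the inventory rather than from discovery.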
-> the all-in-one is optional and that's another subject
The only requirement on the host would be docker.
The base idea is to run:
docker run -e options=... -v /:/rootfs/ --rm kargo-deploy -- init
The kargo-deploy image contains ansible + the kargo scripts; we mount the host volume into the container, and with privileged access we can configure it.
To configure the container_engine, I propose to keep and use the current playbooks. Maybe later we can switch to a shell script instead of ansible, to remove the 'python' dependency from the hosts.
i think i'm losing track of what's being discussed in this thread, which is why i started #324 @Smana. giving users an option for how _they_ want to bootstrap etcd distances us from arguing over _which method_ is better and why. in my use case, i'm very specifically looking for a DNS SRV bootstrap discovery method for etcd, and i like the approach of "bring your own [xyz component]" for the project.
if users are tied to hard dependencies like ansible-pull, kpm built-in, etc., or if the project becomes less democratic and more opinionated about the etcd bootstrap method, i feel like the target audience will become narrower over time.
The discussion has deviated to etcd; maybe we should open a new issue to settle the etcd question.
This issue is about how to switch kargo from push to pull.
The idea is this:
if users are tied to hard dependencies like ansible-pull, kpm built-in, etc,
That's the opposite of what we are trying to solve with this issue.
We want the only requirement on the hosts to be a 'container engine' (docker/rkt).
This is why using ansible-pull is out! I don't want to install ansible on every host.
We have to find something other than ansible-pull (a shell script?) to deploy the kargo image.
@ant31 i agree with you about the docker image which deploys the node where it resides.
Actually, that's why i opened the issue https://github.com/kubespray/kargo/issues/321
That said, how would you configure the local node without using the pull mode (inside the container, of course)?
The main issue is not running inside a container, which is easy to do, but how to configure the local node.
The pull is not mandatory in the case of a docker image (the ansible playbooks are inside the container), but we need to get the inventory from somewhere, automatically (when the node starts).
Maybe I'm missing something, but why not stand up an etcd cluster with discovery and store secrets in etcd?
@v1k0d3n yes, good idea to use the etcd cluster to store the secrets and configuration shared by nodes/masters
Please refer to https://github.com/kubespray/kargome
private repo?
@v1k0d3n Sorry, i've changed my mind and i closed the repo, i'll try to do a PR instead.
hello! i'm curious if this is still in the works? i'd be interested in contributing :)
@billyoung we're thinking about something, but no real work has been done as far as i know.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.