Openshift-ansible: Infra Selector Question

Created on 10 Feb 2016 · 17 Comments · Source: openshift/openshift-ansible

Hey guys, we are transitioning from our own playbooks for creating a docker registry and router to the ones you provide, and we ran into a rather lame problem.

https://github.com/openshift/openshift-ansible/blob/master/playbooks/common/openshift-master/config.yml#L178

This is hardcoded to look for infra nodes with specific tags, even though you allow those tags to be configurable. So if my nodes don't carry the tags the playbook is hardcoded to look for, I don't end up with a router or a registry. Is this a simple oversight, or is there a larger reason for it?

All 17 comments

@jtslear it was an oversight on my part.

@jtslear basically, the byo playbook assumes that 'infra' nodes will have the {region: infra} label, and that is carried through to the playbook.

On the other hand, the aws, gce, openstack, and libvirt playbooks do not use the region label by default and use the {type: infra} label instead, but they override openshift_infra_nodes.

One way to achieve the same result in your inventory file would be the following:

openshift_infra_nodes="{{ hostvars | oo_select_keys(groups['oo_nodes_to_config'])  | oo_nodes_with_label('<my_label>', '<my_value>') | oo_collect('inventory_hostname') }}"

Note that oo_nodes_with_label currently supports only one label/value pair, though.
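For anyone who needs to match more than one label/value pair, the selection could be sketched roughly as below in Python (the language Ansible filter plugins are written in). This is not the actual oo_nodes_with_label implementation: the function name nodes_with_labels, the node names, and the flat {"name": ..., "labels": {...}} structure are all assumptions made for illustration.

```python
def nodes_with_labels(nodes, required):
    """Return the nodes whose labels contain every key/value pair in `required`.

    `nodes` is a list of dicts shaped like {"name": ..., "labels": {...}}.
    This shape is hypothetical, not the real hostvars layout.
    """
    return [
        node for node in nodes
        if all(node.get("labels", {}).get(k) == v for k, v in required.items())
    ]


# Hypothetical example nodes.
nodes = [
    {"name": "node-a", "labels": {"region": "infra", "zone": "router"}},
    {"name": "node-b", "labels": {"region": "infra", "zone": "default"}},
    {"name": "node-c", "labels": {"region": "primary", "zone": "default"}},
]

selected = nodes_with_labels(nodes, {"region": "infra", "zone": "router"})
selected_names = [node["name"] for node in selected]
print(selected_names)  # ['node-a']
```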

After testing what you noted above, I'm not able to get it to work, at least not the way I want it to. During testing, we have 3 nodes dedicated to routing and 3 nodes dedicated to being application hosts. We didn't previously spin up nodes tagged infra, so for the sake of testing all of these became infra nodes. What we ended up with is 3 deployed and running router pods, 3 more stuck in pending because they couldn't be scheduled on the other 3 nodes (not sure why), and then 6 registry pods... This isn't exactly what I was looking for.

We have a particular use case. Our environment requires we separate where the routers live and where the docker registry lives. So we have our application nodes and the docker registry live together. This breaks what you guys have designated as 'infra.' Having this statically defined like this prevents us from moving forward with what you guys have rolled for the router and registry.

Feel free to close this or create a feature request if you deem it worthy. My workaround is to simply set openshift_infra_nodes to an empty string and continue down the path we currently use.

@jtslear In that case, you should be able to use the openshift_router_selector and openshift_registry_selector variables to override the default value. You will still need to define openshift_infra_nodes as non-empty to trigger the deployment. The length of infra_nodes will affect the number of replicas created when the router and/or registry are created, but the _selector variables will override the placement.

You are correct, and just as you noted, the replicas are configured in a way that I don't find to be very legit. I end up with something like what I described above: too many replicas are configured, so pods try to be scheduled and fail because the selector doesn't match an appropriate node.

NAME                       READY     STATUS             RESTARTS   AGE       NODE
docker-registry-2-1t86v    1/1       Running            0          5m        js-node-001.ose.bld.f4tech.com
docker-registry-2-62ikx    1/1       Running            0          5m        js-node-001.ose.bld.f4tech.com
docker-registry-2-etkbj    1/1       Running            0          5m        js-node-001.ose.bld.f4tech.com
docker-registry-2-j90lh    1/1       Running            0          5m        js-node-001.ose.bld.f4tech.com
router-1-arddv             0/1       Pending            0          5m
router-1-g0ea7             0/1       Pending            0          5m
router-1-pbo3c             1/1       Running            0          5m        js-router-001.ose.bld.f4tech.com
router-1-s2e3a             0/1       Pending            0          5m

NAME                               LABELS                                                                                   STATUS    AGE
js-node-001.ose.bld.f4tech.com     blah=blah,kubernetes.io/hostname=js-node-001.ose.bld.f4tech.com,node=true,region=infra   Ready     19h
js-router-001.ose.bld.f4tech.com   kubernetes.io/hostname=js-router-001.ose.bld.f4tech.com,region=infra,router=true         Ready     19h

And this is using the following selectors:

+openshift_router_selector: 'router=true'
+openshift_registry_selector: 'node=true'
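To make the mismatch concrete, here is a rough Python sketch that evaluates the two selectors against the node labels from the listing above. It only handles simple key=value selectors, and the helper names parse_selector and matching_nodes are made up for illustration. Each selector matches exactly one node, while four replicas of each service were requested. That is presumably why the extra router pods stay Pending and all registry pods pile onto one node.

```python
def parse_selector(selector):
    """Turn 'a=b,c=d' into {'a': 'b', 'c': 'd'} (equality-based selectors only)."""
    pairs = (item.split("=", 1) for item in selector.split(",") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def matching_nodes(node_labels, selector):
    """Names of nodes whose labels satisfy every pair in the selector."""
    wanted = parse_selector(selector)
    return [
        name for name, labels in node_labels.items()
        if all(labels.get(k) == v for k, v in wanted.items())
    ]


# Labels copied from the node listing above (hostname label omitted).
node_labels = {
    "js-node-001.ose.bld.f4tech.com": {"blah": "blah", "node": "true", "region": "infra"},
    "js-router-001.ose.bld.f4tech.com": {"region": "infra", "router": "true"},
}

router_nodes = matching_nodes(node_labels, "router=true")
registry_nodes = matching_nodes(node_labels, "node=true")
print(router_nodes)    # ['js-router-001.ose.bld.f4tech.com']
print(registry_nodes)  # ['js-node-001.ose.bld.f4tech.com']
```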

@jtslear It sounds like what we need then is a variable to override the replicas logic for each of the hosted services.

I'm thinking something like openshift_router_replicas and openshift_registry_replicas that would default to the current behavior, but allow it to be overridden if needed.
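The proposed behavior could look roughly like this in Python. This is a sketch only: it assumes the current default is simply the number of infra nodes, which the thread implies but does not spell out, and the function name replica_count and the short host names are hypothetical.

```python
def replica_count(infra_nodes, override=None):
    """Use the explicit override when given, else one replica per infra node.

    The 'one replica per infra node' default is an assumption about the
    current playbook behavior, not something confirmed in this thread.
    """
    if override is not None:
        return override
    return len(infra_nodes)


# Hypothetical host names.
infra_nodes = ["js-node-001", "js-router-001"]

default_replicas = replica_count(infra_nodes)             # falls back to len(infra_nodes)
router_replicas = replica_count(infra_nodes, override=1)  # explicit override wins
print(default_replicas, router_replicas)  # 2 1
```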

@abutcher thoughts?

@detiber I like that - when we add them we can put all of these options together in the byo inventories with some explanation

@detiber If I want to disable the automatic creation of routers and registries completely, is the proper way to override the selectors to add them to /etc/ansible/hosts with an empty value?

openshift_router_selector=''
openshift_registry_selector=''

@pcsherid removing the selectors will just allow them to be deployed to any hosts in the cluster.

The easiest way to disable the creation is to set openshift_master_infra_nodes = [], which will set deploy_infra to False in playbooks/common/openshift-cluster/additional_config.yaml and will prevent the router and the registry from being deployed.

You may also want to make sure that you aren't defining any hosts in the nfs group in the inventory and not setting any of the openshift_hosted_registry_* variables to prevent a registry pv/pvc from being created for you.

@detiber, +1 for openshift_(router|registry)_replicas. Any ETA for that? I hear our recommendation is to have dedicated router and registry nodes. It would be much more useful to have these two decoupled.

@akostadinov not currently, but maybe @abutcher can work it into his delegate_to PR.

@detiber @akostadinov I'd like to incorporate those changes into the work I'll be doing for https://github.com/openshift/openshift-ansible/issues/1532 which is probably the next thing I'll be picking up

@abutcher :+1: from me.

+1, and it would also be useful to have a way to provision a router on every properly labeled node, regardless of how many there are. For example, given a router selector of "region=infra,zone=router": if the environment at time_1 has 2 nodes labeled "region=infra,zone=router", then we have 2 routers on these nodes. If at time_2 the environment has 4 such nodes, then another 2 routers are started. That would remove the need to specify num_routers or num_registries. We would just know that all properly labeled nodes will have a router/registry.
e.g. openshift_registry_replicas=-1 would mean that we launch as many registries as there are suitable nodes.

I guess that may need other changes, not only in the playbook. But it would help to have fewer numbers to care about, thus simplifying installation and maintenance of a cluster.
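A minimal Python sketch of the -1 idea, under the assumption that the playbook already knows which nodes match the selector at run time. The name resolve_replicas and the node names are hypothetical, and nothing like this exists in the playbooks discussed here.

```python
def resolve_replicas(requested, suitable_nodes):
    """Treat -1 as 'one replica per suitable node'; otherwise use the number as given."""
    if requested == -1:
        return len(suitable_nodes)
    return requested


# Hypothetical nodes matching the selector at two points in time.
nodes_at_time_1 = ["router-node-1", "router-node-2"]
nodes_at_time_2 = nodes_at_time_1 + ["router-node-3", "router-node-4"]

replicas_time_1 = resolve_replicas(-1, nodes_at_time_1)
replicas_time_2 = resolve_replicas(-1, nodes_at_time_2)
print(replicas_time_1, replicas_time_2)  # 2 4
```

Note that this only resolves the count when the playbook runs. Starting the extra routers automatically as nodes are added would need support in the product itself.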

@akostadinov I'd think we'd need some type of product support for that.

I know daemonsets allow deploying one pod per node, but I don't know if there is a way to limit that to nodes with specific labels...

After looking at http://kubernetes.io/docs/admin/daemons/ it looks like it is possible; it would just require changes to the way oadm creates the router/registry so that it can use daemonsets.

openshift_hosted_router_replicas=2
openshift_hosted_registry_replicas=2