On running a deploy_cluster playbook, there is an error creating the default registry service. This didn't happen before I added openshift_master_cluster_public_hostname to the inventory/hosts.localhost file (right now it's a single master cluster).
ansible 2.5.2
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.15 (default, May 16 2018, 17:50:09) [GCC 8.1.1 20180502 (Red Hat 8.1.1-1)]
[~/openshift-ansible]$ git describe *[release-3.9]
openshift-ansible-3.9.30-1-14-gb17f21b5a
Expected the cluster to be deployed.
TASK [openshift_hosted : create the default registry service] ***************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "module_stderr": "", "module_stdout": "keys are not equal in dict\n{'ports', 'type', 's
elector', 'sessionAffinity'}\n{'ports', 'sessionAffinity', 'sessionAffinityConfig', 'type', 'selector'}\n\n{\"changed\": true, \"results\
": {\"returncode\": 0, \"cmd\": \"/usr/bin/oc get service docker-registry -o json -n default\", \"results\": [{\"apiVersion\": \"v1\", \"
kind\": \"Service\", \"metadata\": {\"creationTimestamp\": \"2018-06-08T19:04:21Z\", \"name\": \"docker-registry\", \"namespace\": \"defa
ult\", \"resourceVersion\": \"473253\", \"selfLink\": \"/api/v1/namespaces/default/services/docker-registry\", \"uid\": \"c03c33eb-6b4e-1
1e8-83a7-ac1f6b45c3f0\"}, \"spec\": {\"clusterIP\": \"172.30.192.95\", \"ports\": [{\"name\": \"5000-tcp\", \"port\": 5000, \"protocol\":
\"TCP\", \"targetPort\": 5000}], \"selector\": {\"docker-registry\": \"default\"}, \"sessionAffinity\": \"ClientIP\", \"sessionAffinityC
onfig\": {\"clientIP\": {\"timeoutSeconds\": 10800}}, \"type\": \"ClusterIP\"}, \"status\": {\"loadBalancer\": {}}}], \"clusterip\": \"17
2.30.192.95\"}, \"state\": \"present\", \"invocation\": {\"module_args\": {\"namespace\": \"default\", \"name\": \"docker-registry\", \"p
orts\": [{\"name\": \"5000-tcp\", \"port\": 5000, \"protocol\": \"TCP\", \"targetPort\": 5000}], \"selector\": {\"docker-registry\": \"de
fault\"}, \"session_affinity\": \"ClientIP\", \"service_type\": \"ClusterIP\", \"clusterip\": \"\", \"kubeconfig\": \"/etc/origin/master/
admin.kubeconfig\", \"state\": \"present\", \"debug\": false, \"annotations\": null, \"labels\": null, \"portalip\": null, \"external_ips
\": null}}}\n", "msg": "MODULE FAILURE", "rc": 0}
[~/openshift-ansible]$ cat /etc/redhat-release *[release-3.9]
Fedora release 28 (Twenty Eight)
[~/openshift-ansible]$ cat inventory/hosts.localhost
#bare minimum hostfile
[OSEv3:children]
masters
nodes
etcd
[OSEv3:vars]
# if your target hosts are Fedora uncomment this
ansible_python_interpreter=/usr/bin/python3
openshift_deployment_type=origin
openshift_version=3.9.0
openshift_release="3.9.0"
openshift_pkg_version=-3.9.0
osm_cluster_network_cidr=10.128.0.0/14
openshift_portal_net=172.30.0.0/16
osm_host_subnet_length=9
openshift_enable_excluders=false
# localhost likely doesn't meet the minimum requirements
openshift_disable_check=disk_availability,memory_availability
# use firewalld, it's bugged on Atomic Host but not normal spin
os_firewall_use_firewalld=true
# htpasswd auth
# Defining htpasswd users
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd' }]
openshift_node_kubelet_args={'cgroup-driver':['cgroupfs']}
openshift_master_named_certificates=[{"certfile": "/etc/secrets/cevn.pem", "keyfile": "/etc/secrets/cevn.key", "names": ["openshift.cevn.io"]}]
openshift_master_cluster_method=native
#openshift_master_cluster_hostname=openshift.cevn.io
openshift_master_cluster_public_hostname=openshift.cevn.io
openshift_hosted_router_certificate={"certfile": "/etc/secrets/cevn.pem", "keyfile": "/etc/secrets/cevn.key", "cafile": "/etc/secrets/cevn.ca"}
openshift_master_default_subdomain=cevn.io
[masters]
localhost ansible_connection=local
[etcd]
localhost ansible_connection=local
[nodes]
localhost ansible_connection=local openshift_schedulable=true openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
Fedora 28
Origin 3.9
I ran into the same issue. My guess is that the "create the default registry service" task is not idempotent, so it fails when the service already exists.
Here's the task:
- name: create the default registry service
  oc_service:
    namespace: "{{ openshift_hosted_registry_namespace }}"
    name: "{{ openshift_hosted_registry_name }}"
    ports:
    - name: 5000-tcp
      port: 5000
      protocol: TCP
      targetPort: 5000
    selector:
      docker-registry: default
    session_affinity: ClientIP
    service_type: ClusterIP
    clusterip: '{{ openshift_hosted_registry_clusterip | default(omit) }}'
Here's the relevant check in oc_service.py:
# before passing ensure keys match
api_values = set(value.keys()) - set(skip)
user_values = set(user_def[key].keys()) - set(skip)
if api_values != user_values:
    if debug:
        print("keys are not equal in dict")
        print(user_values)
        print(api_values)
    return False
api_values contains sessionAffinityConfig, but that key is not in user_values because the task never sets it.
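To make the mismatch concrete, here is a minimal sketch that diffs the two key sets exactly as they were printed in the error output above. It is only an illustration of the comparison, not the module's actual code; the names only_in_api and only_in_user are made up for this example.

```python
# Key sets copied from the "keys are not equal in dict" error output:
# the first printed set is user_values (from the task), the second is api_values.
user_values = {"ports", "type", "selector", "sessionAffinity"}
api_values = {"ports", "sessionAffinity", "sessionAffinityConfig", "type", "selector"}

# Hypothetical helper names, used only for this illustration.
only_in_api = api_values - user_values    # keys the API server returned but the task never set
only_in_user = user_values - api_values   # keys the task set but the API did not return

print("only in API:", sorted(only_in_api))
print("only in task:", sorted(only_in_user))
print("sets equal:", api_values == user_values)
```

The only difference is sessionAffinityConfig, which the API server fills in for the existing service, so the equality check fails on every rerun.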
I was able to get around this by deleting the service before rerunning deploy_cluster.yml:
oc delete service docker-registry
This impacts openshift-ansible-3.10-* as well.
If this is still broken for 3.10, is there at least a workaround that can be used in the meantime to get past it and complete an installation?
I was able to get it to work by changing the skip list in oc_service.py to
skip = ['metadata', 'status', 'sessionAffinityConfig']
I'm more than happy to submit a PR if it works for you as well.
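For anyone wondering why the extra skip entry helps, here is a rough standalone sketch of the key comparison. It assumes the previous skip list was ['metadata', 'status'] (inferred from the change above, not confirmed here), and keys_match is a hypothetical helper, not a function from oc_service.py. The dicts hold only the spec keys that appear in the printed sets from the error output.

```python
def keys_match(user_dict, api_dict, skip):
    """Compare the key sets of two dicts, ignoring any key listed in skip."""
    api_keys = set(api_dict.keys()) - set(skip)
    user_keys = set(user_dict.keys()) - set(skip)
    return api_keys == user_keys


# Spec as defined by the task (values taken from the error output).
user_spec = {
    "ports": [{"name": "5000-tcp", "port": 5000, "protocol": "TCP", "targetPort": 5000}],
    "selector": {"docker-registry": "default"},
    "sessionAffinity": "ClientIP",
    "type": "ClusterIP",
}

# Spec as returned by the API: same keys plus sessionAffinityConfig.
api_spec = dict(user_spec)
api_spec["sessionAffinityConfig"] = {"clientIP": {"timeoutSeconds": 10800}}

old_skip = ["metadata", "status"]  # assumption: the list before the change
new_skip = ["metadata", "status", "sessionAffinityConfig"]  # the list proposed above

print("old skip list:", keys_match(user_spec, api_spec, old_skip))
print("new skip list:", keys_match(user_spec, api_spec, new_skip))
```

With the old list the comparison returns False; with sessionAffinityConfig skipped it returns True, which is consistent with reruns no longer failing. Note that this only ignores the key during comparison, so a real change to the affinity timeout would not be detected either.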
@nagonzalez I tried your change and it worked. I am able to run the playbook as many times as I want and it doesn't produce this error.
This is still broken in openshift-ansible-3.11-*.
@openshift-bot: Closing this issue.