I am trying to install OpenShift 3.10 on three VMs (master, worker node, and infra node).
The cluster deployment fails with the error "Control plane pods didn't come up" (full output below). Versions in use:
- openshift v3.10.34
- ansible --version
ansible 2.4.6.0
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.5 (default, Feb 20 2018, 09:19:12) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]
- openshift-ansible
[root@master log]# rpm -qa | grep openshift-ansible
openshift-ansible-3.10.47-1.git.0.95bc2d2.el7_5.noarch
openshift-ansible-roles-3.10.47-1.git.0.95bc2d2.el7_5.noarch
openshift-ansible-playbooks-3.10.47-1.git.0.95bc2d2.el7_5.noarch
openshift-ansible-docs-3.10.47-1.git.0.95bc2d2.el7_5.noarch
Setup details:
Three VMs (master, worker node, infra node), all running RHEL 7.5. Inventory file:
# Create an OSEv3 group that contains the masters, nodes, and etcd groups
[OSEv3:children]
masters
nodes
etcd
# Set variables common for all OSEv3 hosts
[OSEv3:vars]
# SSH user, this user should allow ssh based auth without requiring a password
ansible_ssh_user=root
#os_firewall_use_firewalld=True
openshift_disable_check=docker_image_availability
# If ansible_ssh_user is not root, ansible_become must be set to true
#ansible_become=true
openshift_deployment_type=openshift-enterprise
#oreg_url=rkdomain.test/openshift3/ose-${component}:${version}
#openshift_myworks_modify_imagestreams=true
# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# host group for masters
[masters]
master.rkdomain.test
# host group for etcd
[etcd]
master.rkdomain.test
# host group for nodes, includes region info
[nodes]
master.rkdomain.test openshift_node_group_name='node-config-master'
nodeone.rkdomain.test openshift_node_group_name='node-config-compute'
infranode.rkdomain.test openshift_node_group_name='node-config-infra'
[root@master playbooks]# ansible-playbook -vvv deploy_cluster.yml
Expected result: the installation completes without any issues.
Observed result: the run fails with the following error.
[root@master playbooks]# ansible-playbook -vvv deploy_cluster.yml
TASK [openshift_control_plane : Report control plane errors msg=Control plane pods didn't come up] ***********************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_control_plane/tasks/main.yml:215
fatal: [master.rkdomain.test]: FAILED! => {
"changed": false,
"failed": true,
"msg": "Control plane pods didn't come up"
}
NO MORE HOSTS LEFT *******************************************************************************************************************
to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry
PLAY RECAP ***************************************************************************************************************************
infranode.rkdomain.test : ok=24 changed=2 unreachable=0 failed=0
localhost : ok=13 changed=0 unreachable=0 failed=0
master.rkdomain.test : ok=226 changed=34 unreachable=0 failed=1
nodeone.rkdomain.test : ok=24 changed=2 unreachable=0 failed=0
INSTALLER STATUS *********************************************************************************************************************
Initialization : Complete (0:00:36)
Health Check : Complete (0:02:44)
Node Bootstrap Preparation : Complete (0:00:02)
etcd Install : Complete (0:01:36)
Master Install : In Progress (0:25:27)
This phase can be restarted by running: playbooks/openshift-master/config.yml
Failure summary:
1. Hosts: master.rkdomain.test
Play: Configure masters
Task: Report control plane errors
Message: Control plane pods didn't come up
[root@master playbooks]#
Extracts from /var/log/ansible.log file:
2018-09-22 08:15:23,114 p=30768 u=root | Using module file /usr/share/ansible/openshift-ansible/roles/lib_openshift/library/oc_obj.py
2018-09-22 08:15:24,001 p=30768 u=root | FAILED - RETRYING: Wait for control plane pods to appear (5 retries left).Result was: {
"attempts": 56,
"changed": false,
"failed": true,
"invocation": {
"module_args": {
"all_namespaces": null,
"content": null,
"debug": false,
"delete_after": false,
"field_selector": null,
"files": null,
"force": false,
"kind": "pod",
"kubeconfig": "/etc/origin/master/admin.kubeconfig",
"name": "master-controllers-master.rkdomain.test",
"namespace": "kube-system",
"selector": null,
"state": "list"
}
},
"msg": {
"cmd": "/usr/bin/oc get pod master-controllers-master.rkdomain.test -o json -n kube-system",
"results": [
{}
],
"returncode": 1,
"stderr": "The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\n",
"stdout": ""
},
"retries": 61
}
2018-09-22 08:15:29,003 p=30768 u=root | Using module file /usr/share/ansible/openshift-ansible/roles/lib_openshift/library/oc_obj.py
2018-09-22 08:15:29,721 p=30768 u=root | FAILED - RETRYING: Wait for control plane pods to appear (4 retries left).Result was: {
"attempts": 57,
"changed": false,
"failed": true,
"invocation": {
"module_args": {
"all_namespaces": null,
"content": null,
"debug": false,
"delete_after": false,
"field_selector": null,
"files": null,
"force": false,
"kind": "pod",
"kubeconfig": "/etc/origin/master/admin.kubeconfig",
"name": "master-controllers-master.rkdomain.test",
"namespace": "kube-system",
"selector": null,
"state": "list"
}
},
"msg": {
"cmd": "/usr/bin/oc get pod master-controllers-master.rkdomain.test -o json -n kube-system",
"results": [
{}
],
"returncode": 1,
"stderr": "The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\n",
"stdout": ""
},
"retries": 61
}
2018-09-22 08:15:34,722 p=30768 u=root | Using module file /usr/share/ansible/openshift-ansible/roles/lib_openshift/library/oc_obj.py
2018-09-22 08:15:35,605 p=30768 u=root | FAILED - RETRYING: Wait for control plane pods to appear (3 retries left).Result was: {
"attempts": 58,
"changed": false,
"failed": true,
"invocation": {
"module_args": {
"all_namespaces": null,
"content": null,
"debug": false,
"delete_after": false,
"field_selector": null,
"files": null,
"force": false,
"kind": "pod",
"kubeconfig": "/etc/origin/master/admin.kubeconfig",
"name": "master-controllers-master.rkdomain.test",
"namespace": "kube-system",
"selector": null,
"state": "list"
}
},
"msg": {
"cmd": "/usr/bin/oc get pod master-controllers-master.rkdomain.test -o json -n kube-system",
"results": [
{}
],
"returncode": 1,
"stderr": "The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\n",
"stdout": ""
},
"retries": 61
}
2018-09-22 08:15:40,604 p=30768 u=root | Using module file /usr/share/ansible/openshift-ansible/roles/lib_openshift/library/oc_obj.py
2018-09-22 08:15:41,587 p=30768 u=root | FAILED - RETRYING: Wait for control plane pods to appear (2 retries left).Result was: {
"attempts": 59,
"changed": false,
"failed": true,
"invocation": {
"module_args": {
"all_namespaces": null,
"content": null,
"debug": false,
"delete_after": false,
"field_selector": null,
"files": null,
"force": false,
"kind": "pod",
"kubeconfig": "/etc/origin/master/admin.kubeconfig",
"name": "master-controllers-master.rkdomain.test",
"namespace": "kube-system",
"selector": null,
"state": "list"
}
},
"msg": {
"cmd": "/usr/bin/oc get pod master-controllers-master.rkdomain.test -o json -n kube-system",
"results": [
{}
],
"returncode": 1,
"stderr": "The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\n",
"stdout": ""
},
"retries": 61
}
2018-09-22 08:15:46,591 p=30768 u=root | Using module file /usr/share/ansible/openshift-ansible/roles/lib_openshift/library/oc_obj.py
2018-09-22 08:15:47,673 p=30768 u=root | FAILED - RETRYING: Wait for control plane pods to appear (1 retries left).Result was: {
"attempts": 60,
"changed": false,
"failed": true,
"invocation": {
"module_args": {
"all_namespaces": null,
"content": null,
"debug": false,
"delete_after": false,
"field_selector": null,
"files": null,
"force": false,
"kind": "pod",
"kubeconfig": "/etc/origin/master/admin.kubeconfig",
"name": "master-controllers-master.rkdomain.test",
"namespace": "kube-system",
"selector": null,
"state": "list"
}
},
"msg": {
"cmd": "/usr/bin/oc get pod master-controllers-master.rkdomain.test -o json -n kube-system",
"results": [
{}
],
"returncode": 1,
"stderr": "The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\n",
"stdout": ""
},
"retries": 61
}
2018-09-22 08:15:52,676 p=30768 u=root | Using module file /usr/share/ansible/openshift-ansible/roles/lib_openshift/library/oc_obj.py
2018-09-22 08:15:53,625 p=30768 u=root | The full traceback is:
File "/tmp/ansible_LcKeXE/ansible_module_oc_obj.py", line 47, in <module>
import ruamel.yaml as yaml
2018-09-22 08:15:53,626 p=30768 u=root | failed: [master.rkdomain.test] (item=controllers) => {
"attempts": 60,
"changed": false,
"failed": true,
"invocation": {
"module_args": {
"all_namespaces": null,
"content": null,
"debug": false,
"delete_after": false,
"field_selector": null,
"files": null,
"force": false,
"kind": "pod",
"kubeconfig": "/etc/origin/master/admin.kubeconfig",
"name": "master-controllers-master.rkdomain.test",
"namespace": "kube-system",
"selector": null,
"state": "list"
}
},
"item": "controllers",
"msg": {
"cmd": "/usr/bin/oc get pod master-controllers-master.rkdomain.test -o json -n kube-system",
"results": [
{}
],
"returncode": 1,
"stderr": "The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\n",
"stdout": ""
}
}
2018-09-22 08:15:53,630 p=30768 u=root | ...ignoring
2018-09-22 08:15:53,654 p=30768 u=root | TASK [openshift_control_plane : Check status in the kube-system namespace _raw_params={{ openshift_client_binary }} status --config={{ openshift.common.config_base }}/master/admin.kubeconfig -n kube-system] ***
2018-09-22 08:15:53,654 p=30768 u=root | task path: /usr/share/ansible/openshift-ansible/roles/openshift_control_plane/tasks/main.yml:188
2018-09-22 08:15:53,743 p=30768 u=root | Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
2018-09-22 08:15:54,578 p=30768 u=root | fatal: [master.rkdomain.test]: FAILED! => {
"changed": true,
"cmd": [
"oc",
"status",
"--config=/etc/origin/master/admin.kubeconfig",
"-n",
"kube-system"
],
"delta": "0:00:00.256543",
"end": "2018-09-22 08:15:54.537255",
"failed": true,
"invocation": {
"module_args": {
"_raw_params": "oc status --config=/etc/origin/master/admin.kubeconfig -n kube-system",
"_uses_shell": false,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"warn": true
}
},
"msg": "non-zero return code",
"rc": 1,
"start": "2018-09-22 08:15:54.280712",
"stderr": "The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?\nThe connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"stderr_lines": [
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?",
"The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?"
],
"stdout": "",
"stdout_lines": []
}
2018-09-22 08:15:54,578 p=30768 u=root | ...ignoring
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://master.rkdomain.test:8443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster.rkdomain.test&limit=500&resourceVersion=0: dial tcp 192.168.151.6:8443: getsockopt: connection refused",
"Sep 22 08:15:55 master.rkdomain.test atomic-openshift-node[18119]: E0922 08:15:55.614538 18119 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://master.rkdomain.test:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.151.6:8443: getsockopt: connection refused",
"Sep 22 08:15:56 master.rkdomain.test atomic-openshift-node[18119]: W0922 08:15:56.201702 18119 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d",
"Sep 22 08:15:56 master.rkdomain.test atomic-openshift-node[18119]: E0922 08:15:56.201909 18119 kubelet.go:2146] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized",
"Sep 22 08:15:56 master.rkdomain.test atomic-openshift-node[18119]: E0922 08:15:56.594113 18119 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://master.rkdomain.test:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster.rkdomain.test&limit=500&resourceVersion=0: dial tcp 192.168.151.6:8443: getsockopt: connection refused",
"Sep 22 08:15:56 master.rkdomain.test atomic-openshift-node[18119]: E0922 08:15:56.611648 18119 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://master.rkdomain.test:8443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster.rkdomain.test&limit=500&resourceVersion=0: dial tcp 192.168.151.6:8443: getsockopt: connection refused",
"Sep 22 08:15:56 master.rkdomain.test atomic-openshift-node[18119]: E0922 08:15:56.616173 18119 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://master.rkdomain.test:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.151.6:8443: getsockopt: connection refused"
]
}
2018-09-22 08:15:58,022 p=30768 u=root | TASK [openshift_control_plane : Report control plane errors msg=Control plane pods didn't come up] ***********************************
2018-09-22 08:15:58,022 p=30768 u=root | task path: /usr/share/ansible/openshift-ansible/roles/openshift_control_plane/tasks/main.yml:215
2018-09-22 08:15:58,112 p=30768 u=root | fatal: [master.rkdomain.test]: FAILED! => {
"changed": false,
"failed": true,
"msg": "Control plane pods didn't come up"
}
2018-09-22 08:15:58,115 p=30768 u=root | NO MORE HOSTS LEFT *******************************************************************************************************************
2018-09-22 08:15:58,117 p=30768 u=root | to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry
2018-09-22 08:15:58,117 p=30768 u=root | PLAY RECAP ***************************************************************************************************************************
2018-09-22 08:15:58,118 p=30768 u=root | infranode.rkdomain.test : ok=24 changed=2 unreachable=0 failed=0
2018-09-22 08:15:58,118 p=30768 u=root | localhost : ok=13 changed=0 unreachable=0 failed=0
2018-09-22 08:15:58,119 p=30768 u=root | master.rkdomain.test : ok=226 changed=34 unreachable=0 failed=1
2018-09-22 08:15:58,119 p=30768 u=root | nodeone.rkdomain.test : ok=24 changed=2 unreachable=0 failed=0
2018-09-22 08:15:58,119 p=30768 u=root | INSTALLER STATUS *********************************************************************************************************************
2018-09-22 08:15:58,125 p=30768 u=root | Initialization : Complete (0:00:36)
2018-09-22 08:15:58,126 p=30768 u=root | Health Check : Complete (0:02:44)
2018-09-22 08:15:58,127 p=30768 u=root | Node Bootstrap Preparation : Complete (0:00:02)
2018-09-22 08:15:58,127 p=30768 u=root | etcd Install : Complete (0:01:36)
2018-09-22 08:15:58,128 p=30768 u=root | Master Install : In Progress (0:25:27)
2018-09-22 08:15:58,128 p=30768 u=root | This phase can be restarted by running: playbooks/openshift-master/config.yml
2018-09-22 08:15:58,129 p=30768 u=root | Failure summary:
1. Hosts: master.rkdomain.test
Play: Configure masters
Task: Report control plane errors
Message: Control plane pods didn't come up
OS Version
[root@master playbooks]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.5 (Maipo)
[root@master playbooks]#
Regarding the error "The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?":
Your hostnames and FQDNs should all resolve to their respective nodes.
Do the following DNS records resolve properly on your master node?
master.rkdomain.test
master
nodeone.rkdomain.test
nodeone
infranode.rkdomain.test
infranode
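The lookups asked for above can be sketched as a small script to run on the master. This is only an illustration, assuming Python 3 is available there; the helper names `resolve_all` and `report` are made up for this example. It uses the system resolver, so the short names only resolve if a search domain or /etc/hosts entry covers them.

```python
import socket

# Names taken from the list above; short names rely on the resolver's
# search domain or on /etc/hosts entries.
HOSTS = [
    "master.rkdomain.test", "master",
    "nodeone.rkdomain.test", "nodeone",
    "infranode.rkdomain.test", "infranode",
]


def resolve_all(names):
    """Return {name: IPv4 address or None} using the system resolver."""
    results = {}
    for name in names:
        try:
            results[name] = socket.gethostbyname(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


def report(results):
    """Format one line per name, flagging names that did not resolve."""
    return [
        "{:<28} {}".format(name, ip or "DOES NOT RESOLVE")
        for name, ip in results.items()
    ]
```

Calling `print("\n".join(report(resolve_all(HOSTS))))` prints one line per name. Unlike ping, this also makes it obvious when only the FQDNs resolve and the short names do not.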
My DNS is working properly and resolving all hosts:
[root@master playbooks]# ping master.rkdomain.test
PING master.rkdomain.test (192.168.151.6) 56(84) bytes of data.
64 bytes from master.rkdomain.test (192.168.151.6): icmp_seq=1 ttl=64 time=0.027 ms
64 bytes from master.rkdomain.test (192.168.151.6): icmp_seq=2 ttl=64 time=0.056 ms
64 bytes from master.rkdomain.test (192.168.151.6): icmp_seq=3 ttl=64 time=0.106 ms
^C
--- master.rkdomain.test ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.027/0.063/0.106/0.032 ms
[root@master playbooks]# ping nodeone.rkdomain.test
PING nodeone.rkdomain.test (192.168.151.7) 56(84) bytes of data.
64 bytes from 192.168.151.7 (192.168.151.7): icmp_seq=1 ttl=64 time=0.260 ms
64 bytes from 192.168.151.7 (192.168.151.7): icmp_seq=2 ttl=64 time=0.182 ms
64 bytes from 192.168.151.7 (192.168.151.7): icmp_seq=3 ttl=64 time=0.332 ms
^C
--- nodeone.rkdomain.test ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.182/0.258/0.332/0.061 ms
[root@master playbooks]# ping infranode.rkdomain.test
PING infranode.rkdomain.test (192.168.151.8) 56(84) bytes of data.
64 bytes from 192.168.151.8 (192.168.151.8): icmp_seq=1 ttl=64 time=2.43 ms
64 bytes from 192.168.151.8 (192.168.151.8): icmp_seq=2 ttl=64 time=0.250 ms
64 bytes from 192.168.151.8 (192.168.151.8): icmp_seq=3 ttl=64 time=0.387 ms
^C
--- infranode.rkdomain.test ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.250/1.023/2.432/0.997 ms
[root@master playbooks]#
Some more information on container states and image details:
[root@master playbooks]# docker ps -a --no-trunc
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7a1a0472c472bd30bb249c7a581b6d1ea4760046f1673427ae7e991d6ca42b2d sha256:0876b0e0420eb7c8f8a854bee2b8009d34ceaf72f94fc4be77577f287b014ab8 "/bin/bash -c '#!/bin/bash\nset -euo pipefail\nif [[ -f /etc/origin/master/master.env ]]; then\n set -o allexport\n source /etc/origin/master/master.env\nfi\nexec openshift start master api --config=/etc/origin/master/master-config.yaml --loglevel=${DEBUG_LOGLEVEL:-2}\n'" 2 minutes ago Exited (255) About a minute ago k8s_api_master-api-master.rkdomain.test_kube-system_dd7b1af9ba90c83bcb768358ee4eae82_162
26aacaaab617e11285c0b0f6d2f39528d9e2639aa5f20239969851e6a4f66ef5 sha256:0876b0e0420eb7c8f8a854bee2b8009d34ceaf72f94fc4be77577f287b014ab8 "/bin/bash -c '#!/bin/bash\nset -euo pipefail\nif [[ -f /etc/origin/master/master.env ]]; then\n set -o allexport\n source /etc/origin/master/master.env\nfi\nexec openshift start master controllers --config=/etc/origin/master/master-config.yaml --listen=https://0.0.0.0:8444 --loglevel=${DEBUG_LOGLEVEL:-2}\n'" 2 minutes ago Exited (255) 2 minutes ago k8s_controllers_master-controllers-master.rkdomain.test_kube-system_b584d10a132929806c8a1ca90cc9d2f8_179
f583598090af0b59274029967078572518031e42360745194eee00733d4d0409 registry.access.redhat.com/openshift3/ose-pod:v3.10.34 "/usr/bin/pod" 14 hours ago Up 14 hours k8s_POD_master-controllers-master.rkdomain.test_kube-system_b584d10a132929806c8a1ca90cc9d2f8_0
cca7bbcd49ce77300898961e680673572faaa32160968e61d182cd49349cb643 registry.access.redhat.com/openshift3/ose-pod:v3.10.34 "/usr/bin/pod" 14 hours ago Up 14 hours k8s_POD_master-api-master.rkdomain.test_kube-system_dd7b1af9ba90c83bcb768358ee4eae82_0
46e627be37b8084eed93bf775b392174dd8bd7b08ab3f1435ea9cd8d90bf770b sha256:43e20e9024abe06e3cccea0e2799cd50f9b9bdf54aa8812b6030a592ff45e2a9 "/bin/bash -c '#!/bin/bash\nset -euo pipefail\nif [[ -f /etc/origin/master/master.env ]]; then\n set -o allexport\n source /etc/origin/master/master.env\nfi\nexec openshift start master api --config=/etc/origin/master/master-config.yaml --loglevel=${DEBUG_LOGLEVEL:-2}\n'" 14 hours ago Exited (255) 14 hours ago k8s_api_master-api-master.rkdomain.test_kube-system_5946c1f644096161a1242b3de0ee5875_20
d7adc9fd2289b66e7b5b28adee2c61072c672aa9d84d9b60162c007dbe799c74 sha256:98217b7c89052267e1ed02a41217c2e03577b96125e923e95941ac010f209ee6 "/bin/sh -c '#!/bin/sh\nset -o allexport\nsource /etc/etcd/etcd.conf\nexec etcd\n'" 16 hours ago Up 16 hours k8s_etcd_master-etcd-master.rkdomain.test_kube-system_0601a86ffa84b8827bdfe38dd765467c_0
4fb6fbec28da3509abdd2bb497dd0a0dbac60fd8fab35a2da05993d6ef87a332 sha256:43e20e9024abe06e3cccea0e2799cd50f9b9bdf54aa8812b6030a592ff45e2a9 "/bin/bash -c '#!/bin/bash\nset -euo pipefail\nif [[ -f /etc/origin/master/master.env ]]; then\n set -o allexport\n source /etc/origin/master/master.env\nfi\nexec openshift start master controllers --config=/etc/origin/master/master-config.yaml --listen=https://0.0.0.0:8444 --loglevel=${DEBUG_LOGLEVEL:-2}\n'" 16 hours ago Up 16 hours k8s_controllers_master-controllers-master.rkdomain.test_kube-system_8e879171c85e221fb0a023e3f10ca276_0
ba3999d448825a2b4cebb26523deb778cf85ac8dfc5e3674632d27553523cf5c registry.access.redhat.com/openshift3/ose-pod:v3.10.34 "/usr/bin/pod" 16 hours ago Up 16 hours k8s_POD_master-controllers-master.rkdomain.test_kube-system_8e879171c85e221fb0a023e3f10ca276_0
09acd81c65bc7c206fbbd5910a8a6e54647eb8509adc5869f1d471dc2ff4ff0e registry.access.redhat.com/openshift3/ose-pod:v3.10.34 "/usr/bin/pod" 16 hours ago Up 16 hours k8s_POD_master-api-master.rkdomain.test_kube-system_5946c1f644096161a1242b3de0ee5875_0
b84b0840b45df01079ce2055f444ee7f5bc6b127cef77d07699d7ac511d49578 registry.access.redhat.com/openshift3/ose-pod:v3.10.34 "/usr/bin/pod" 16 hours ago Up 16 hours k8s_POD_master-etcd-master.rkdomain.test_kube-system_0601a86ffa84b8827bdfe38dd765467c_0
[root@master playbooks]# docker images --all --no-trunc
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.access.redhat.com/openshift3/ose-node v3.10 sha256:21037c699d025cc350a8deaab858238fa9731913704441510ee63f82b74cd263 12 days ago 1.27 GB
registry.access.redhat.com/openshift3/ose-control-plane v3.10 sha256:43e20e9024abe06e3cccea0e2799cd50f9b9bdf54aa8812b6030a592ff45e2a9 12 days ago 789 MB
registry.access.redhat.com/openshift3/ose-control-plane v3.10.34 sha256:0876b0e0420eb7c8f8a854bee2b8009d34ceaf72f94fc4be77577f287b014ab8 4 weeks ago 789 MB
registry.access.redhat.com/openshift3/ose-pod v3.10.34 sha256:7ab1dd3408ede07b8709301ad75cd3b5ea52dd0ba3a26e8a504db97f0e613a9e 4 weeks ago 214 MB
registry.access.redhat.com/rhel7/etcd 3.2.22 sha256:98217b7c89052267e1ed02a41217c2e03577b96125e923e95941ac010f209ee6 6 weeks ago 256 MB
From what I can see, no service is listening on port 8443. I am wondering why that service is not running, and I would appreciate help from the experts.
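To tell "nothing is listening" apart from "the name points at an interface the API is not bound to", it can help to probe the port on each address separately. The following is a minimal sketch, assuming Python 3; `port_is_open` is a hypothetical helper, and 192.168.151.6 is simply the address the master's name resolved to in the ping output above.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolvable
        return False


# Example usage on the master (not executed here):
#   for addr in ("127.0.0.1", "192.168.151.6", "master.rkdomain.test"):
#       print(addr, port_is_open(addr, 8443))
```

If every address reports False, the API process is not up at all, which matches the "Exited (255)" status of the k8s_api container in the docker output; the logs of that exited container are then the next place to look.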
oc commands output
[root@master playbooks]# oc get nodes
The connection to the server master.rkdomain.test:8443 was refused - did you specify the right host or port?
[root@master playbooks]#
One more piece of information:
Each of my VMs has two network interfaces.
1. ens192: connected to the private IP
2. ens224: connected to the public IP
My master API service has picked up the public IP, but start_api.go is trying to start the service on the private IP.
Is there any way I can bind all my services to a specific network interface?
Ahh, that's helpful. I use Vagrant to test and have to explicitly set the IP as well.
Add this to [OSEv3:vars]:
# Configure nodeIP in the node config
# This is needed in cases where node traffic is desired to go over an
# interface other than the default network interface.
openshift_set_node_ip=true
Then add openshift_ip=x.x.x.x to each node:
[nodes]
master.rkdomain.test openshift_ip=x.x.x.x openshift_node_group_name='node-config-master'
nodeone.rkdomain.test openshift_ip=x.x.x.x openshift_node_group_name='node-config-compute'
infranode.rkdomain.test openshift_ip=x.x.x.x openshift_node_group_name='node-config-infra'
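As an illustration of what the filled-in lines look like, here is a small sketch (Python 3 assumed) that renders the [nodes] section from a host-to-IP table. The `render_nodes` helper is hypothetical. The addresses are the ones the ping output earlier in the thread resolved to; whether they belong to the interface you actually want OpenShift to use is an assumption you need to check against `ip addr` on each VM.

```python
import ipaddress

# Host names and node groups come from the inventory in this thread.
# The addresses are the ones ping resolved earlier; treating them as the
# addresses OpenShift should use is an assumption.
NODES = [
    ("master.rkdomain.test", "192.168.151.6", "node-config-master"),
    ("nodeone.rkdomain.test", "192.168.151.7", "node-config-compute"),
    ("infranode.rkdomain.test", "192.168.151.8", "node-config-infra"),
]


def render_nodes(nodes):
    """Render the [nodes] inventory section with an explicit openshift_ip."""
    lines = ["[nodes]"]
    for host, ip, group in nodes:
        ipaddress.ip_address(ip)  # raises ValueError for placeholders like x.x.x.x
        lines.append(
            "{} openshift_ip={} openshift_node_group_name='{}'".format(host, ip, group)
        )
    return "\n".join(lines)
```

`print(render_nodes(NODES))` produces lines in the same shape as the ones above, and the validation step catches a placeholder such as x.x.x.x that was left in by accident.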
I have the same problem: when I start a fresh install, the control plane pods won't launch. In docker I see a container running a bash script start, and a few seconds later it is gone. The API service is not running on port 8443.
[root@chmas-l01p ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a59c7513ec70 43e20e9024ab "/bin/bash -c '#!/..." 1 second ago Up Less than a second k8s_api_master-api-chmas-l01p.int.zone.nl_kube-system_5946c1f644096161a1242b3de0ee5875_12
aaf9caa9f285 registry.access.redhat.com/openshift3/ose-pod:v3.10.45 "/usr/bin/pod" 34 seconds ago Up 33 seconds k8s_POD_master-controllers-sscc-chmas-l01p.int.zone.nl_kube-system_8e879171c85e221fb0a023e3f10ca276_2
1391b24bacf6 registry.access.redhat.com/openshift3/ose-pod:v3.10.45 "/usr/bin/pod" 34 seconds ago Up 33 seconds k8s_POD_master-api-sscc-chmas-l01p.int.zone.nl_kube-system_5946c1f644096161a1242b3de0ee5875_2
[root@chmas-l01p ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
aaf9caa9f285 registry.access.redhat.com/openshift3/ose-pod:v3.10.45 "/usr/bin/pod" 39 seconds ago Up 38 seconds k8s_POD_master-controllers-chmas-l01p.int.zone.nl_kube-system_8e879171c85e221fb0a023e3f10ca276_2
1391b24bacf6 registry.access.redhat.com/openshift3/ose-pod:v3.10.45 "/usr/bin/pod" 39 seconds ago Up 38 seconds k8s_POD_master-api-chmas-l01p.int.zone.nl_kube-system_5946c1f644096161a1242b3de0ee5875_2
[root@chmas-l01p ~]#
@nagonzalez thank you for the help. It worked for me.
But I have now run into another problem, related to CloudForms:
fatal: [master.rkdomain.test]: FAILED! => {
"attempts": 30,
"changed": true,
"cmd": [
"oc",
"logs",
"cloudforms-0",
"-n",
"openshift-management"
],
"delta": "0:00:00.384272",
"end": "2018-09-24 07:21:18.944020",
"failed": true,
"invocation": {
"module_args": {
"_raw_params": "oc logs cloudforms-0 -n openshift-management",
"_uses_shell": false,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"warn": true
}
},
"rc": 0,
"start": "2018-09-24 07:21:18.559748",
"stderr": "",
"stderr_lines": [],
"stdout": "",
"stdout_lines": []
}
PLAY RECAP ***********************************************************************************************************************************************************************************************************************************
infranode.rkdomain.test : ok=1 changed=0 unreachable=0 failed=0
localhost : ok=11 changed=0 unreachable=0 failed=0
master.rkdomain.test : ok=86 changed=1 unreachable=0 failed=1
nodeone.rkdomain.test : ok=1 changed=0 unreachable=0 failed=0
INSTALLER STATUS *****************************************************************************************************************************************************************************************************************************
Initialization : Complete (0:00:24)
Management Install : In Progress (0:17:07)
This phase can be restarted by running: playbooks/openshift-management/config.yml
Monday 24 September 2018 07:21:19 -0400 (0:15:26.243) 0:17:30.594 ******
===============================================================================
Currently running pods:
[root@master ~]# oc get pods
NAME READY STATUS RESTARTS AGE
docker-registry-1-l52dq 1/1 Running 0 5h
registry-console-1-fbsqj 1/1 Running 1 5h
router-2-wrp4f 1/1 Running 0 5h
[root@master ~]#
I do not see the CloudForms pod running.
Any suggestions on how to overcome this failure?
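Note that the failing task queries the openshift-management namespace, while a plain `oc get pods` only lists the current project, so the CloudForms pod would not show up in the listing above either way. A minimal sketch of scoping the commands to that namespace, assuming Python 3 and an `oc` client that is on PATH and logged in as in the transcript; `oc_command` and `run` are hypothetical helpers:

```python
import subprocess

# Namespace taken from the failing task's command line in the output above.
NAMESPACE = "openshift-management"


def oc_command(namespace, *args):
    """Build an oc argument list scoped to the given namespace."""
    return ["oc"] + list(args) + ["-n", namespace]


def run(argv):
    """Run a command and return (returncode, stdout, stderr) as text."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Example usage on the master (not executed here):
#   print(run(oc_command(NAMESPACE, "get", "pods"))[1])
#   print(run(oc_command(NAMESPACE, "describe", "pod", "cloudforms-0"))[1])
```

The `describe` output includes the pod's events, which is usually more telling than the empty `oc logs` result the task got back.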
Do you have the name of the play that failed?
It fails in this step: ansible-playbook -vvv openshift-management/config.yml
One more thing: I needed a reboot after the openshift-master/config.yml step.
Otherwise the infra node did not become ready, and no roles were attached to it.
Here is the output of: ansible-playbook -vvv openshift-management/config.yml
fatal: [master.rkdomain.test]: FAILED! => {
"attempts": 30,
"changed": true,
"cmd": [
"oc",
"logs",
"cloudforms-0",
"-n",
"openshift-management"
],
"delta": "0:00:00.504908",
"end": "2018-09-24 14:26:02.611125",
"failed": true,
"invocation": {
"module_args": {
"_raw_params": "oc logs cloudforms-0 -n openshift-management",
"_uses_shell": false,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"warn": true
}
},
"rc": 0,
"start": "2018-09-24 14:26:02.106217",
"stderr": "",
"stderr_lines": [],
"stdout": "",
"stdout_lines": []
}
PLAY RECAP ***********************************************************************************************************************************************************************************************************************************
infranode.rkdomain.test : ok=1 changed=0 unreachable=0 failed=0
localhost : ok=11 changed=0 unreachable=0 failed=0
master.rkdomain.test : ok=86 changed=1 unreachable=0 failed=1
nodeone.rkdomain.test : ok=1 changed=0 unreachable=0 failed=0
INSTALLER STATUS *****************************************************************************************************************************************************************************************************************************
Initialization : Complete (0:00:25)
Management Install : In Progress (0:17:09)
This phase can be restarted by running: playbooks/openshift-management/config.yml
Monday 24 September 2018 14:26:02 -0400 (0:15:27.738) 0:17:35.108 ******
===============================================================================
openshift_management : Wait for the app to come up. May take several minutes, 30s check intervals, 30 retries ----------------------------------------------------------------------------------------------------------------------- 927.74s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/main.yml:89 -----------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Ensure the Management App is created -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 29.20s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/main.yml:82 -----------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_nfs : Install nfs-utils ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 26.19s
/usr/share/ansible/openshift-ansible/roles/openshift_nfs/tasks/setup.yml:5 ------------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Configure role/user permissions ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 8.99s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:45 ------------------------------------------------------------------------------------------------------------------------------------------------------------
Gathering Facts ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 4.14s
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Ensure the CFME system accounts have all the required SCCs ----------------------------------------------------------------------------------------------------------------------------------------------------- 3.51s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/accounts.yml:12 -------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Ensure the CFME system accounts exist -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 3.49s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/accounts.yml:4 --------------------------------------------------------------------------------------------------------------------------------------------------------
Gather Cluster facts ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 3.28s
/usr/share/ansible/openshift-ansible/playbooks/init/cluster_facts.yml:27 --------------------------------------------------------------------------------------------------------------------------------------------------------------------
Initialize openshift.node.sdn_mtu ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 2.44s
/usr/share/ansible/openshift-ansible/playbooks/init/cluster_facts.yml:59 --------------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Create Admin and Image Inspector Service Account ----------------------------------------------------------------------------------------------------------------------------------------------------------------- 2.31s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:7 -------------------------------------------------------------------------------------------------------------------------------------------------------------
get openshift_current_version --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 2.17s
/usr/share/ansible/openshift-ansible/playbooks/init/cluster_facts.yml:10 --------------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Ensure the CFME system accounts have the required roles -------------------------------------------------------------------------------------------------------------------------------------------------------- 2.13s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/accounts.yml:21 -------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Add Management Infrastructure project ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1.91s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:2 -------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Create manageiq cluster role ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1.60s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:16 ------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Check if the Management Server template has been created already ----------------------------------------------------------------------------------------------------------------------------------------------- 1.57s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/template.yml:18 -------------------------------------------------------------------------------------------------------------------------------------------------------
Run variable sanity checks ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 1.35s
/usr/share/ansible/openshift-ansible/playbooks/init/sanity_checks.yml:14 --------------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Configure 3_2 role/user permissions ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 1.17s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:53 ------------------------------------------------------------------------------------------------------------------------------------------------------------
Setting sebool container_manage_cgroup ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 1.09s
/usr/share/ansible/openshift-ansible/playbooks/openshift-management/private/config-sebool.yml:7 ---------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Check if the Management App PV template has been created already ----------------------------------------------------------------------------------------------------------------------------------------------- 1.07s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/template.yml:77 -------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_nfs : Enable and start NFS services ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1.06s
/usr/share/ansible/openshift-ansible/roles/openshift_nfs/tasks/setup.yml:26 -----------------------------------------------------------------------------------------------------------------------------------------------------------------
[root@master playbooks]#
My new inventory file looks like this:
# Create an OSEv3 group that contains the masters, nodes, and etcd groups
[OSEv3:children]
masters
nodes
etcd
# Set variables common for all OSEv3 hosts
[OSEv3:vars]
# SSH user, this user should allow ssh based auth without requiring a password
ansible_ssh_user=root
#os_firewall_use_firewalld=True
openshift_disable_check=docker_image_availability
#openshift_public_ip=192.168.151.6
openshift_set_node_ip=true
openshift_management_install_management=true
openshift_management_app_template=cfme-template
openshift_management_storage_nfs_base_dir=/nfs-share
openshift_management_storage_nfs_local_hostname=master.rkdomain.test
# If ansible_ssh_user is not root, ansible_become must be set to true
#ansible_become=true
openshift_deployment_type=openshift-enterprise
#oreg_url=rkdomain.test/openshift3/ose-${component}:${version}
#openshift_example_modify_imagestreams=true
# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# host group for masters
[masters]
master.rkdomain.test
# host group for etcd
[etcd]
master.rkdomain.test
# host group for nodes, includes region info
[nodes]
master.rkdomain.test openshift_ip=192.168.151.6 openshift_node_group_name='node-config-master'
nodeone.rkdomain.test openshift_ip=192.168.151.7 openshift_node_group_name='node-config-compute'
infranode.rkdomain.test openshift_ip=192.168.151.8 openshift_node_group_name='node-config-infra'
My NFS share looks like this:
[root@master playbooks]# showmount -e
Export list for master.rkdomain.test:
/nfs-share/cfme-db *
/nfs-share/cfme-app *
/nfs-share nodeone.rkdomain.test,master.rkdomain.test
[root@master playbooks]#
@nagonzalez Here is the task that failed, along with its output:
2018-09-24 14:10:34,944 p=14908 u=root | task path: /usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/main.yml:89
2018-09-24 14:10:34,945 p=14908 u=root | Monday 24 September 2018 14:10:34 -0400 (0:00:29.198) 0:02:07.369 ******
2018-09-24 14:10:35,180 p=14908 u=root | Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
2018-09-24 14:10:36,240 p=14908 u=root | FAILED - RETRYING: Wait for the app to come up. May take several minutes, 30s check intervals, 30 retries (30 retries left).Result was: {
"attempts": 1,
"changed": true,
"cmd": [
"oc",
"logs",
"cloudforms-0",
"-n",
"openshift-management"
],
"delta": "0:00:00.564535",
"end": "2018-09-24 14:10:36.141797",
"failed": false,
"invocation": {
"module_args": {
"_raw_params": "oc logs cloudforms-0 -n openshift-management",
"_uses_shell": false,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"warn": true
}
},
"rc": 0,
"retries": 31,
"start": "2018-09-24 14:10:35.577262",
"stderr": "",
"stderr_lines": [],
"stdout": "",
"stdout_lines": []
}
Something is probably unhealthy in the openshift-management project.
What do the events look like?
oc get events -n openshift-management
Here is the output:
[root@master ~]# oc get events -n openshift-management
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
3m 4h 875 cloudforms-0.15579a7b6d2e2f7b Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 2 Insufficient cpu, 2 node(s) didn't match node selector.
3m 4h 1055 cloudforms-postgresql.15579a7bdb6e08a3 PersistentVolumeClaim Normal FailedBinding persistentvolume-controller no persistent volumes available for this claim and no storage class is set
[root@master ~]#
My PVs:
[root@master ~]# oc get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
cfme-app 5Gi RWO Retain Bound openshift-management/cloudforms-server-cloudforms-0 1d
cfme-db 15Gi RWO Retain Released openshift-management/cloudforms-postgresql 1d
pv0001 5Gi RWO Recycle Bound default/nfs-claim1 1d
[root@master ~]#
My PVC:
[root@master ~]# oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nfs-claim1 Bound pv0001 5Gi RWO 1d
[root@master ~]#
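A note on the PV listing above: cfme-db is in the Released state with a Retain reclaim policy. In Kubernetes, a Released volume with Retain is not offered to a new claim until an administrator reclaims it, which may be why the cloudforms-postgresql claim reports "no persistent volumes available". The sketch below is only an illustration: released_pvs is a hypothetical helper name, and it assumes the column layout printed in this thread (STATUS as the fifth whitespace-separated field of each data row).

```shell
# List PVs whose STATUS column is "Released" from `oc get pv` text on stdin.
# Assumption: default column layout as shown in this thread, so STATUS is
# field 5 of every data row (the header line is skipped).
released_pvs() {
  awk 'NR > 1 && $5 == "Released" { print $1 }'
}

# Sample input copied from the listing above.
released_pvs <<'EOF'
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
cfme-app 5Gi RWO Retain Bound openshift-management/cloudforms-server-cloudforms-0 1d
cfme-db 15Gi RWO Retain Released openshift-management/cloudforms-postgresql 1d
pv0001 5Gi RWO Recycle Bound default/nfs-claim1 1d
EOF
```

On a live cluster this would be used as `oc get pv | released_pvs`. One commonly used way to make a Released volume bindable again is to clear its claim reference, for example `oc patch pv cfme-db -p '{"spec":{"claimRef":null}}'`; treat that as a suggestion to verify first, since any data left on the volume is kept. Also note that `oc get pvc` without `-n openshift-management` lists only the current project, so the CloudForms claims would not appear in the output above.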
What are the specs of infranode.rkdomain.test?
By the looks of the error message, it appears you don't have enough CPU resources left on your node.
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
3m 4h 875 cloudforms-0.15579a7b6d2e2f7b Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 2 Insufficient cpu, 2 node(s) didn't match node selector.
It's requesting 1 vCPU here:
https://github.com/openshift/openshift-ansible/blob/a7bacc1ebf3da623cadee79faba5ee6e0ed71eb6/roles/openshift_management/files/templates/cloudforms/cfme-template.yaml#L986
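To make the "Insufficient cpu" arithmetic concrete, here is a minimal sketch of the check the scheduler is effectively doing: a pod fits only if its CPU request is no larger than the node's allocatable CPU minus what other pods have already requested. The helper names (to_millicores, cpu_fits) and the 300m figure for already-requested CPU are hypothetical and for illustration only; the sketch handles whole cores ("1") and millicores ("500m"), not fractional values such as "0.5".

```shell
# Convert a Kubernetes CPU quantity to millicores.
# Handles whole cores ("1", "4") and millicores ("500m") only.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# cpu_fits ALLOCATABLE ALREADY_REQUESTED NEW_REQUEST
# Succeeds (exit status 0) if the new request fits in what is left.
cpu_fits() {
  local alloc used req
  alloc=$(to_millicores "$1")
  used=$(to_millicores "$2")
  req=$(to_millicores "$3")
  [ $(( alloc - used )) -ge "$req" ]
}

# A 1-vCPU node with a hypothetical 300m already requested cannot take
# a pod that requests a full vCPU.
if cpu_fits 1 300m 1; then
  echo "fits"
else
  echo "does not fit"
fi
```

The real inputs can be read from the cluster: `oc get node infranode.rkdomain.test -o jsonpath='{.status.allocatable.cpu}'` prints the allocatable CPU, and the "Allocated resources" section of `oc describe node infranode.rkdomain.test` shows what is already requested.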
infranode.rkdomain.test has the following CPU details:
[root@infranode ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
Stepping: 1
CPU MHz: 2596.992
BogoMIPS: 5193.98
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm 3dnowprefetch epb cat_l3 cdp_l3 fsgsbase smep cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
[root@infranode ~]#
master.rkdomain.test has the following CPU details:
[root@master ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
Stepping: 1
CPU MHz: 2596.992
BogoMIPS: 5193.98
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm 3dnowprefetch epb cat_l3 cdp_l3 fsgsbase smep cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
[root@master ~]#
nodeone.rkdomain.test has the following CPU details:
[root@nodeone ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
Stepping: 1
CPU MHz: 2596.992
BogoMIPS: 5193.98
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm 3dnowprefetch epb cat_l3 cdp_l3 fsgsbase smep cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
[root@nodeone ~]#
Which machine is short of CPUs?
I followed this guide: https://docs.openshift.com/container-platform/3.10/install/prerequisites.html
Have I missed anything?
It would be your infra node, per: https://github.com/openshift/openshift-ansible/blob/a7bacc1ebf3da623cadee79faba5ee6e0ed71eb6/roles/openshift_management/defaults/main.yml#L9
Bumping up the vCPU count on your infra node will allow the pod to be scheduled.
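For anyone reading the FailedScheduling message, it can help to split it into its individual reasons. The following is a small sketch (sched_reasons is a hypothetical helper name, not part of oc) that takes the message text as its argument and prints one reason per line.

```shell
# Split a scheduler "0/N nodes are available: ..." message into one
# reason per line: drop the prefix up to the first ": ", drop the
# trailing period, then break on commas and trim leading spaces.
sched_reasons() {
  printf '%s\n' "$1" \
    | sed -e 's/^[^:]*: //' -e 's/\.$//' \
    | tr ',' '\n' \
    | sed 's/^ *//'
}

# Message copied from the events output earlier in this thread.
sched_reasons "0/3 nodes are available: 2 Insufficient cpu, 2 node(s) didn't match node selector."
```

In this case the counts add up to 4 across 3 nodes, which suggests a node can be listed under more than one reason. Two nodes do not match the node selector, so only one node is a candidate at all, and that candidate is among the nodes that are short on CPU.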
I have added one more vCPU to my infra node and am waiting for the result.
Will keep you posted @nagonzalez
It failed again (I can still see some CPU issues):
oc get events -n openshift-management
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
20m 20m 1 ansible-1-deploy.1557b6af9d81cf83 Pod Normal Scheduled default-scheduler Successfully assigned ansible-1-deploy to infranode.rkdomain.test
20m 20m 1 ansible-1-deploy.1557b6b034920735 Pod spec.containers{deployment} Normal Pulled kubelet, infranode.rkdomain.test Container image "registry.access.redhat.com/openshift3/ose-deployer:v3.10.45" already present on machine
20m 20m 1 ansible-1-deploy.1557b6b0390c8635 Pod spec.containers{deployment} Normal Created kubelet, infranode.rkdomain.test Created container
20m 20m 1 ansible-1-deploy.1557b6b047302290 Pod spec.containers{deployment} Normal Started kubelet, infranode.rkdomain.test Started container
20m 20m 1 ansible.1557b6af9b990c83 DeploymentConfig Normal DeploymentCreated deploymentconfig-controller Created new replication controller "ansible-1" for version 1
32m 8h 1729 cloudforms-0.15579a7b6d2e2f7b Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 2 Insufficient cpu, 2 node(s) didn't match node selector.
27m 29m 9 cloudforms-0.1557b63c30478450 Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 1 Insufficient cpu, 1 node(s) were not ready, 1 node(s) were out of disk space, 2 node(s) didn't match node selector.
23m 23m 1 cloudforms-0.1557b684e0a4c8fc Pod spec.containers{cloudforms} Normal Pulling kubelet, infranode.rkdomain.test pulling image "registry.access.redhat.com/cloudforms46/cfme-openshift-app-ui:latest"
20m 20m 1 cloudforms-0.1557b6ade8780998 Pod Normal Scheduled default-scheduler Successfully assigned cloudforms-0 to infranode.rkdomain.test
20m 20m 1 cloudforms-0.1557b6ae63451127 Pod spec.containers{cloudforms} Normal Pulling kubelet, infranode.rkdomain.test pulling image "registry.access.redhat.com/cloudforms46/cfme-openshift-app-ui:latest"
17m 17m 1 cloudforms-0.1557b6df5cac4594 Pod spec.containers{cloudforms} Normal Pulled kubelet, infranode.rkdomain.test Successfully pulled image "registry.access.redhat.com/cloudforms46/cfme-openshift-app-ui:latest"
17m 17m 1 cloudforms-0.1557b6df5cc8e2d2 Pod spec.containers{cloudforms} Warning Failed kubelet, infranode.rkdomain.test Error: cannot find volume "cloudforms-server" to mount into container "cloudforms"
17m 17m 1 cloudforms-0.1557b6e2d8fd1c2c Pod spec.containers{cloudforms} Normal Pulled kubelet, infranode.rkdomain.test Successfully pulled image "registry.access.redhat.com/cloudforms46/cfme-openshift-app-ui:latest"
17m 17m 1 cloudforms-0.1557b6e2e3f7e8f4 Pod spec.containers{cloudforms} Normal Created kubelet, infranode.rkdomain.test Created container
17m 17m 1 cloudforms-0.1557b6e2f5ba3930 Pod spec.containers{cloudforms} Normal Started kubelet, infranode.rkdomain.test Started container
47s 13m 58 cloudforms-0.1557b711d3650d2d Pod spec.containers{cloudforms} Warning Unhealthy kubelet, infranode.rkdomain.test Readiness probe failed: dial tcp 10.129.0.78:80: getsockopt: connection refused
2m 8h 2212 cloudforms-postgresql.15579a7bdb6e08a3 PersistentVolumeClaim Normal FailedBinding persistentvolume-controller no persistent volumes available for this claim and no storage class is set
20m 20m 1 cloudforms.1557b6ad8da39e77 StatefulSet Normal SuccessfulDelete statefulset-controller delete Pod cloudforms-0 in StatefulSet cloudforms successful
20m 20m 1 cloudforms.1557b6ade80d30a3 StatefulSet Normal SuccessfulCreate statefulset-controller create Pod cloudforms-0 in StatefulSet cloudforms successful
20m 20m 1 httpd-1-deploy.1557b6b042732a82 Pod Normal Scheduled default-scheduler Successfully assigned httpd-1-deploy to infranode.rkdomain.test
20m 20m 1 httpd-1-deploy.1557b6b0ee3623a4 Pod spec.containers{deployment} Normal Pulled kubelet, infranode.rkdomain.test Container image "registry.access.redhat.com/openshift3/ose-deployer:v3.10.45" already present on machine
20m 20m 1 httpd-1-deploy.1557b6b0f0adbce8 Pod spec.containers{deployment} Normal Created kubelet, infranode.rkdomain.test Created container
20m 20m 1 httpd-1-deploy.1557b6b0fee77828 Pod spec.containers{deployment} Normal Started kubelet, infranode.rkdomain.test Started container
5m 20m 57 httpd-1-ldnzz.1557b6b11b1d935b Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 1 Insufficient cpu, 2 node(s) didn't match node selector.
40s 40s 1 httpd-1-ldnzz.1557b7c8ee994fd8 Pod Warning FailedScheduling default-scheduler skip schedule deleting pod: openshift-management/httpd-1-ldnzz
20m 20m 1 httpd-1.1557b6b11bbc927b ReplicationController Normal SuccessfulCreate replication-controller Created pod: httpd-1-ldnzz
40s 40s 1 httpd-1.1557b7c8eee53a24 ReplicationController Normal SuccessfulDelete replication-controller Deleted pod: httpd-1-ldnzz
20m 20m 1 httpd.1557b6b03f09ebd5 DeploymentConfig Normal DeploymentCreated deploymentconfig-controller Created new replication controller "httpd-1" for version 1
40s 40s 1 httpd.1557b7c8ec49da02 DeploymentConfig Normal ReplicationControllerScaled deploymentconfig-controller Scaled replication controller "httpd-1" from 1 to 0
20m 20m 1 memcached-1-deploy.1557b6aed7e46e00 Pod Normal Scheduled default-scheduler Successfully assigned memcached-1-deploy to infranode.rkdomain.test
20m 20m 1 memcached-1-deploy.1557b6af6122af60 Pod spec.containers{deployment} Normal Pulled kubelet, infranode.rkdomain.test Container image "registry.access.redhat.com/openshift3/ose-deployer:v3.10.45" already present on machine
20m 20m 1 memcached-1-deploy.1557b6af63f92a7d Pod spec.containers{deployment} Normal Created kubelet, infranode.rkdomain.test Created container
20m 20m 1 memcached-1-deploy.1557b6af73142b9e Pod spec.containers{deployment} Normal Started kubelet, infranode.rkdomain.test Started container
25m 25m 5 memcached-1-p4xlt.1557b66d4beafe6c Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 1 node(s) were not ready, 1 node(s) were out of disk space, 2 node(s) didn't match node selector.
25m 25m 3 memcached-1-p4xlt.1557b670cb332ba5 Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 1 node(s) didn't match node selector, 2 node(s) were not ready, 2 node(s) were out of disk space.
24m 24m 2 memcached-1-p4xlt.1557b67bf8f0996f Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 1 node(s) were not ready, 2 node(s) didn't match node selector.
23m 23m 1 memcached-1-p4xlt.1557b689f49cd8a7 Pod Normal Scheduled default-scheduler Successfully assigned memcached-1-p4xlt to infranode.rkdomain.test
23m 23m 1 memcached-1-p4xlt.1557b68adc7024d6 Pod spec.containers{memcached} Normal Pulling kubelet, infranode.rkdomain.test pulling image "registry.access.redhat.com/cloudforms46/cfme-openshift-memcached:latest"
17m 17m 1 memcached-1-p4xlt.1557b6e123736af4 Pod spec.containers{memcached} Normal Pulled kubelet, infranode.rkdomain.test Successfully pulled image "registry.access.redhat.com/cloudforms46/cfme-openshift-memcached:latest"
17m 17m 1 memcached-1-p4xlt.1557b6e12387bbc2 Pod spec.containers{memcached} Warning Failed kubelet, infranode.rkdomain.test Error: cannot find volume "default-token-2kjwl" to mount into container "memcached"
20m 20m 1 memcached-1-z4ljm.1557b6af9a36e0c7 Pod Normal Scheduled default-scheduler Successfully assigned memcached-1-z4ljm to infranode.rkdomain.test
20m 20m 1 memcached-1-z4ljm.1557b6b03bfc80b1 Pod spec.containers{memcached} Normal Pulling kubelet, infranode.rkdomain.test pulling image "registry.access.redhat.com/cloudforms46/cfme-openshift-memcached:latest"
17m 17m 1 memcached-1-z4ljm.1557b6e48f6a55f0 Pod spec.containers{memcached} Normal Pulled kubelet, infranode.rkdomain.test Successfully pulled image "registry.access.redhat.com/cloudforms46/cfme-openshift-memcached:latest"
17m 17m 1 memcached-1-z4ljm.1557b6e4976728b8 Pod spec.containers{memcached} Normal Created kubelet, infranode.rkdomain.test Created container
17m 17m 1 memcached-1-z4ljm.1557b6e4a3ff4268 Pod spec.containers{memcached} Normal Started kubelet, infranode.rkdomain.test Started container
25m 25m 1 memcached-1.1557b66d462f1e88 ReplicationController Normal SuccessfulCreate replication-controller Created pod: memcached-1-p4xlt
20m 20m 1 memcached-1.1557b6af990b1e86 ReplicationController Normal SuccessfulCreate replication-controller Created pod: memcached-1-z4ljm
20m 20m 1 memcached.1557b6aed2740566 DeploymentConfig Normal DeploymentCreated deploymentconfig-controller Created new replication controller "memcached-1" for version 1
14m 20m 26 postgresql-1-2x5tm.1557b6afecab52d5 Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 1 Insufficient cpu, 1 Insufficient memory, 2 node(s) didn't match node selector.
10m 10m 1 postgresql-1-2x5tm.1557b73bf6c04670 Pod Warning FailedScheduling default-scheduler skip schedule deleting pod: openshift-management/postgresql-1-2x5tm
20m 20m 1 postgresql-1-deploy.1557b6af37e69dbd Pod Normal Scheduled default-scheduler Successfully assigned postgresql-1-deploy to infranode.rkdomain.test
20m 20m 1 postgresql-1-deploy.1557b6afaac0461d Pod spec.containers{deployment} Normal Pulled kubelet, infranode.rkdomain.test Container image "registry.access.redhat.com/openshift3/ose-deployer:v3.10.45" already present on machine
20m 20m 1 postgresql-1-deploy.1557b6afafc5e25a Pod spec.containers{deployment} Normal Created kubelet, infranode.rkdomain.test Created container
20m 20m 1 postgresql-1-deploy.1557b6afc3520777 Pod spec.containers{deployment} Normal Started kubelet, infranode.rkdomain.test Started container
20m 20m 1 postgresql-1.1557b6afeca40714 ReplicationController Normal SuccessfulCreate replication-controller Created pod: postgresql-1-2x5tm
10m 10m 1 postgresql-1.1557b73bf7225452 ReplicationController Normal SuccessfulDelete replication-controller Deleted pod: postgresql-1-2x5tm
20m 20m 1 postgresql.1557b6af35f493e8 DeploymentConfig Normal DeploymentCreated deploymentconfig-controller Created new replication controller "postgresql-1" for version 1
10m 10m 1 postgresql.1557b73bf4cccce7 DeploymentConfig Normal ReplicationControllerScaled deploymentconfig-controller Scaled replication controller "postgresql-1" from 1 to 0
infranode.rkdomain.test CPU details:
[root@master ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
Stepping: 1
CPU MHz: 2596.992
BogoMIPS: 5193.98
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm 3dnowprefetch epb cat_l3 cdp_l3 fsgsbase smep cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
[root@master ~]#
oc get nodes
[root@master ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
infranode.rkdomain.test Ready infra 1d v1.10.0+b81c8f8
master.rkdomain.test Ready master 1d v1.10.0+b81c8f8
nodeone.rkdomain.test Ready compute 1d v1.10.0+b81c8f8
[root@master ~]#
ansible-playbook -vvv openshift-management/config.yml
fatal: [master.rkdomain.test]: FAILED! => {
"attempts": 30,
"changed": true,
"cmd": [
"oc",
"logs",
"cloudforms-0",
"-n",
"openshift-management"
],
"delta": "0:00:00.517450",
"end": "2018-09-25 14:19:55.346904",
"failed": true,
"invocation": {
"module_args": {
"_raw_params": "oc logs cloudforms-0 -n openshift-management",
"_uses_shell": false,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"warn": true
}
},
"rc": 0,
"start": "2018-09-25 14:19:54.829454",
"stderr": "",
"stderr_lines": [],
"stdout": "== Checking 172.30.247.208:11211 status ==\n172.30.247.208:11211 - accepting connections\n== Checking 172.30.117.206:5432 status ==\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.\nNcat: No route to host.",
"stdout_lines": [
"== Checking 172.30.247.208:11211 status ==",
"172.30.247.208:11211 - accepting connections",
"== Checking 172.30.117.206:5432 status ==",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host.",
"Ncat: No route to host."
]
}
PLAY RECAP ********************************************************************************************************************************************************************************************************************************
infranode.rkdomain.test : ok=1 changed=0 unreachable=0 failed=0
localhost : ok=11 changed=0 unreachable=0 failed=0
master.rkdomain.test : ok=86 changed=1 unreachable=0 failed=1
nodeone.rkdomain.test : ok=1 changed=0 unreachable=0 failed=0
INSTALLER STATUS **************************************************************************************************************************************************************************************************************************
Initialization : Complete (0:00:42)
Management Install : In Progress (0:17:18)
This phase can be restarted by running: playbooks/openshift-management/config.yml
Tuesday 25 September 2018 14:19:55 -0400 (0:15:29.439) 0:18:00.007 *****
===============================================================================
openshift_management : Wait for the app to come up. May take several minutes, 30s check intervals, 30 retries -------------------------------------------------------------------------------------------------------------------- 929.44s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/main.yml:89 --------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Ensure the Management App is created ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- 30.57s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/main.yml:82 --------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_nfs : Install nfs-utils ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 28.04s
/usr/share/ansible/openshift-ansible/roles/openshift_nfs/tasks/setup.yml:5 ---------------------------------------------------------------------------------------------------------------------------------------------------------------
Gathering Facts ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 17.07s
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Configure role/user permissions ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 9.62s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:45 ---------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Ensure the CFME system accounts exist ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- 4.20s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/accounts.yml:4 -----------------------------------------------------------------------------------------------------------------------------------------------------
Gather Cluster facts --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 4.13s
/usr/share/ansible/openshift-ansible/playbooks/init/cluster_facts.yml:27 -----------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Ensure the CFME system accounts have all the required SCCs -------------------------------------------------------------------------------------------------------------------------------------------------- 3.87s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/accounts.yml:12 ----------------------------------------------------------------------------------------------------------------------------------------------------
Initialize openshift.node.sdn_mtu -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 2.42s
/usr/share/ansible/openshift-ansible/playbooks/init/cluster_facts.yml:59 -----------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Ensure the CFME system accounts have the required roles ----------------------------------------------------------------------------------------------------------------------------------------------------- 2.19s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/accounts.yml:21 ----------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Create Admin and Image Inspector Service Account -------------------------------------------------------------------------------------------------------------------------------------------------------------- 2.08s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:7 ----------------------------------------------------------------------------------------------------------------------------------------------------------
get openshift_current_version ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 2.04s
/usr/share/ansible/openshift-ansible/playbooks/init/cluster_facts.yml:10 -----------------------------------------------------------------------------------------------------------------------------------------------------------------
Setting sebool container_manage_cgroup --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 2.04s
/usr/share/ansible/openshift-ansible/playbooks/openshift-management/private/config-sebool.yml:7 ------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Check if the Management Server template has been created already -------------------------------------------------------------------------------------------------------------------------------------------- 1.90s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/template.yml:18 ----------------------------------------------------------------------------------------------------------------------------------------------------
Run variable sanity checks --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1.67s
/usr/share/ansible/openshift-ansible/playbooks/init/sanity_checks.yml:14 -----------------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_management : Check if the Management DB PV template has been created already --------------------------------------------------------------------------------------------------------------------------------------------- 1.53s
/usr/share/ansible/openshift-ansible/roles/openshift_management/tasks/template.yml:106 ---------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Add Management Infrastructure project ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1.39s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:2 ----------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Create manageiq cluster role ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1.32s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:16 ---------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_manageiq : Configure 3_2 role/user permissions --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1.29s
/usr/share/ansible/openshift-ansible/roles/openshift_manageiq/tasks/main.yaml:53 ---------------------------------------------------------------------------------------------------------------------------------------------------------
openshift_nfs : Enable and start NFS services -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1.08s
/usr/share/ansible/openshift-ansible/roles/openshift_nfs/tasks/setup.yml:26 --------------------------------------------------------------------------------------------------------------------------------------------------------------
[root@master playbooks]#
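In the cloudforms-0 log above, 172.30.247.208:11211 accepts connections, while every attempt on 172.30.117.206:5432 (5432 is the default PostgreSQL port) ends in "No route to host". Below is a minimal Python 3 sketch for repeating that kind of TCP check by hand. The function name `can_connect` and the 3-second default timeout are illustrative assumptions, not part of openshift-ansible or the CloudForms image.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Any socket-level failure (refused, no route to host, timeout) is
    reported as False. This is a hypothetical helper for manual checks.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("172.30.117.206", 5432)` from a host or pod on the cluster network would show whether the database service IP is reachable from there. Service IPs are only routable from inside the cluster, so a False result from an outside machine says nothing about the cluster.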
After struggling with this for days, I finally found the reason (for my setup).
When running multiple nodes with etcd placed on the same node as the master, the master container cannot reach the etcd container, because the DNS lookup inside the container resolves to 127.0.0.1 when it needs to use the external IP (in my case 192.168.60.150).
So there seems to be a bug in the installer or in DNS. After I moved etcd to a dedicated node, the lookup returned the configured IP, and the master container could talk to its etcd.
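A quick way to confirm this kind of problem is to check what a hostname resolves to from the place where the lookup happens. Below is a minimal Python 3 sketch. The helper names `resolved_addresses` and `resolves_to_loopback` are hypothetical, and the check only reports what the local resolver returns. It does not explain why the installer or DNS produced that answer.

```python
import ipaddress
import socket


def resolved_addresses(hostname):
    """Return the sorted, de-duplicated IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None)
    # info[4][0] is the address string; strip any IPv6 scope id such as "%eth0".
    return sorted({info[4][0].split("%")[0] for info in infos})


def resolves_to_loopback(hostname):
    """Return True if any resolved address is a loopback address (127.0.0.0/8 or ::1)."""
    return any(
        ipaddress.ip_address(addr).is_loopback
        for addr in resolved_addresses(hostname)
    )
```

Run inside the master container, `resolves_to_loopback("master.rkdomain.test")` returning True would match the behaviour described above. The hostname is taken from the inventory in this thread, so substitute your own.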
Here are the documented requirements:
https://github.com/openshift/openshift-ansible/tree/release-3.10/roles/openshift_management
You're going to have to increase resources until the CPU and memory requests can be fulfilled.
@nagonzalez For the time being I have disabled the "openshift_management" component.
Now I can log in to the OpenShift Web Console, but the next step is not smooth either.
I am unable to deploy any of the apps listed in the Catalog. See the error in the attached screenshot. It says "contains unresolved images":

And: oc status
[root@master playbooks]# oc status
In project rk project for demo (rkproject) on server https://master.rkdomain.test:8443
dc/php deploys openshift/php:7.0
deployment #1 waiting on image
2 infos identified, use 'oc status -v' to see details.
[root@master playbooks]#
I'm guessing the php images aren't in your internal docker registry. What's the output of the following?
oc get imagestreams -n openshift
oc get imagestream.image.openshift.io/php -n openshift
Here they are:
[root@master playbooks]# oc get imagestreams -n openshift
NAME DOCKER REPO TAGS UPDATED
dotnet docker-registry.default.svc:5000/openshift/dotnet 1.0,1.1,2.0 + 1 more...
dotnet-runtime docker-registry.default.svc:5000/openshift/dotnet-runtime 2.1,2.0
fis-java-openshift docker-registry.default.svc:5000/openshift/fis-java-openshift 1.0,2.0
fis-karaf-openshift docker-registry.default.svc:5000/openshift/fis-karaf-openshift 2.0,1.0
httpd docker-registry.default.svc:5000/openshift/httpd 2.4
java docker-registry.default.svc:5000/openshift/java 8
jboss-amq-62 docker-registry.default.svc:5000/openshift/jboss-amq-62 1.6,1.7,1.1 + 4 more...
jboss-amq-63 docker-registry.default.svc:5000/openshift/jboss-amq-63 1.3,1.0,1.1 + 1 more...
jboss-datagrid65-client-openshift docker-registry.default.svc:5000/openshift/jboss-datagrid65-client-openshift 1.0,1.1
jboss-datagrid65-openshift docker-registry.default.svc:5000/openshift/jboss-datagrid65-openshift 1.6,1.2,1.3 + 2 more...
jboss-datagrid71-client-openshift docker-registry.default.svc:5000/openshift/jboss-datagrid71-client-openshift 1.0
jboss-datagrid71-openshift docker-registry.default.svc:5000/openshift/jboss-datagrid71-openshift 1.0,1.1,1.2 + 1 more...
jboss-datagrid72-openshift docker-registry.default.svc:5000/openshift/jboss-datagrid72-openshift 1.0
jboss-datavirt63-driver-openshift docker-registry.default.svc:5000/openshift/jboss-datavirt63-driver-openshift 1.0,1.1
jboss-datavirt63-openshift docker-registry.default.svc:5000/openshift/jboss-datavirt63-openshift 1.4,1.0,1.1 + 2 more...
jboss-decisionserver62-openshift docker-registry.default.svc:5000/openshift/jboss-decisionserver62-openshift 1.2
jboss-decisionserver63-openshift docker-registry.default.svc:5000/openshift/jboss-decisionserver63-openshift 1.3,1.4
jboss-decisionserver64-openshift docker-registry.default.svc:5000/openshift/jboss-decisionserver64-openshift 1.0,1.1,1.2 + 1 more...
jboss-eap64-openshift docker-registry.default.svc:5000/openshift/jboss-eap64-openshift 1.1,1.3,1.4 + 6 more...
jboss-eap70-openshift docker-registry.default.svc:5000/openshift/jboss-eap70-openshift 1.3,1.4,1.5 + 2 more...
jboss-eap71-openshift docker-registry.default.svc:5000/openshift/jboss-eap71-openshift 1.1,1.2,1.3 + 1 more...
jboss-fuse70-console docker-registry.default.svc:5000/openshift/jboss-fuse70-console 1.0
jboss-fuse70-eap-openshift docker-registry.default.svc:5000/openshift/jboss-fuse70-eap-openshift 1.0
jboss-fuse70-java-openshift docker-registry.default.svc:5000/openshift/jboss-fuse70-java-openshift 1.0
jboss-fuse70-karaf-openshift docker-registry.default.svc:5000/openshift/jboss-fuse70-karaf-openshift 1.0
jboss-processserver63-openshift docker-registry.default.svc:5000/openshift/jboss-processserver63-openshift 1.3,1.4
jboss-processserver64-openshift docker-registry.default.svc:5000/openshift/jboss-processserver64-openshift 1.3,1.0,1.1 + 1 more...
jboss-webserver30-tomcat7-openshift docker-registry.default.svc:5000/openshift/jboss-webserver30-tomcat7-openshift 1.2,1.3,1.1
jboss-webserver30-tomcat8-openshift docker-registry.default.svc:5000/openshift/jboss-webserver30-tomcat8-openshift 1.1,1.2,1.3
jboss-webserver31-tomcat7-openshift docker-registry.default.svc:5000/openshift/jboss-webserver31-tomcat7-openshift 1.0,1.1,1.2
jboss-webserver31-tomcat8-openshift docker-registry.default.svc:5000/openshift/jboss-webserver31-tomcat8-openshift 1.0,1.1,1.2
jenkins docker-registry.default.svc:5000/openshift/jenkins 1,2
mariadb docker-registry.default.svc:5000/openshift/mariadb 10.1,10.2
mongodb docker-registry.default.svc:5000/openshift/mongodb 3.2,3.4,2.4 + 1 more...
mysql docker-registry.default.svc:5000/openshift/mysql 5.5,5.6,5.7
nginx docker-registry.default.svc:5000/openshift/nginx 1.12,1.8,1.10
nodejs docker-registry.default.svc:5000/openshift/nodejs 8-RHOAR,0.10,4 + 2 more...
perl docker-registry.default.svc:5000/openshift/perl 5.24,5.16,5.20
php docker-registry.default.svc:5000/openshift/php 7.0,7.1,5.5 + 1 more...
postgresql docker-registry.default.svc:5000/openshift/postgresql 9.2,9.4,9.5 + 1 more...
python docker-registry.default.svc:5000/openshift/python 2.7,3.3,3.4 + 2 more...
redhat-openjdk18-openshift docker-registry.default.svc:5000/openshift/redhat-openjdk18-openshift 1.3,1.4,1.0 + 2 more...
redhat-sso70-openshift docker-registry.default.svc:5000/openshift/redhat-sso70-openshift 1.3,1.4
redhat-sso71-openshift docker-registry.default.svc:5000/openshift/redhat-sso71-openshift 1.0,1.1,1.2 + 1 more...
redhat-sso72-openshift docker-registry.default.svc:5000/openshift/redhat-sso72-openshift 1.0,1.1,1.2
redis docker-registry.default.svc:5000/openshift/redis 3.2
rhdm70-decisioncentral-openshift docker-registry.default.svc:5000/openshift/rhdm70-decisioncentral-openshift 1.0
rhdm70-kieserver-openshift docker-registry.default.svc:5000/openshift/rhdm70-kieserver-openshift 1.0
rhpam70-businesscentral-indexing-openshift docker-registry.default.svc:5000/openshift/rhpam70-businesscentral-indexing-openshift 1.0
rhpam70-businesscentral-monitoring-openshift docker-registry.default.svc:5000/openshift/rhpam70-businesscentral-monitoring-openshift 1.0
rhpam70-businesscentral-openshift docker-registry.default.svc:5000/openshift/rhpam70-businesscentral-openshift 1.0
rhpam70-controller-openshift docker-registry.default.svc:5000/openshift/rhpam70-controller-openshift 1.0
rhpam70-kieserver-openshift docker-registry.default.svc:5000/openshift/rhpam70-kieserver-openshift 1.0
rhpam70-smartrouter-openshift docker-registry.default.svc:5000/openshift/rhpam70-smartrouter-openshift 1.0
ruby docker-registry.default.svc:5000/openshift/ruby 2.3,2.4,2.5 + 2 more...
[root@master playbooks]#
[root@master playbooks]# oc get imagestream.image.openshift.io/php -n openshift
NAME DOCKER REPO TAGS UPDATED
php docker-registry.default.svc:5000/openshift/php 5.5,5.6,7.0 + 1 more...
[root@master playbooks]#
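The listings above show tag names, but the UPDATED column appears empty. That would be consistent with the tags being defined but never imported, although the table output alone does not prove it. Below is a minimal Python 3 sketch that takes the JSON form of an image stream (for example from `oc get imagestream php -n openshift -o json`) and lists the spec tags that have no imported image recorded in the status. The function name `unresolved_tags` and the sample document are illustrative assumptions. The sample is hand-written and is not output from this cluster.

```python
import json


def unresolved_tags(imagestream):
    """Return names of spec tags that have no imported image in status.tags."""
    status_tags = (imagestream.get("status") or {}).get("tags") or []
    # A tag counts as resolved only if its status entry has at least one item.
    resolved = {t["tag"] for t in status_tags if t.get("items")}
    spec_tags = (imagestream.get("spec") or {}).get("tags") or []
    return [t["name"] for t in spec_tags if t["name"] not in resolved]


# Hand-written sample for illustration only; real output has many more fields.
sample = json.loads("""
{
  "kind": "ImageStream",
  "metadata": {"name": "php", "namespace": "openshift"},
  "spec": {"tags": [{"name": "7.0"}, {"name": "7.1"}]},
  "status": {
    "tags": [
      {"tag": "7.0", "items": null,
       "conditions": [{"type": "ImportSuccess", "status": "False"}]},
      {"tag": "7.1", "items": [{"image": "sha256:example"}]}
    ]
  }
}
""")

print(unresolved_tags(sample))
```

For real output, a tag reported here would be one that `dc/php` could be left "waiting on image" for, as in the `oc status` output earlier in the thread.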
@raj4linux It is hard to chase everyone, but we keep repeating that an open issue should not be hijacked by other issues...
Initially you reported that the control plane pods were not coming up, then a solution was provided by @nagonzalez, and then the conversation moved to a different issue.
Quite frankly, it is hard and time-consuming to follow a GitHub issue that has mixed sub-issues.
Out of 10 random issues, 8 have _oh, I have this issue too, and this, and that_, and you lose track of what the initial problem was.
Please don't take it personally, I'm only trying to bring some sanity around. Hopefully, with more discipline from you and others, we'll succeed together.
@DanyC97 You are right. I should not have mixed two issues. My apologies for that.
I will be more careful in future.