I've installed OpenShift Origin 3.9 successfully with Ansible on 4 nodes (1 master + 3 nodes with GlusterFS, with the registry also on Gluster), running CentOS 7.4.1708 (Core).
Here is the inventory.ini file.
````ini
[OSEv3:children]
masters
nodes
etcd
glusterfs
glusterfs_registry
[masters]
192.168.1.111 openshift_ip=192.168.1.111 openshift_schedulable=true
[etcd]
192.168.1.111 openshift_ip=192.168.1.111
[nodes]
192.168.1.111 openshift_ip=192.168.1.111 openshift_schedulable=true openshift_node_labels="{'region': 'infra', 'zone': 'default', 'nodetype': 'default'}"
192.168.1.115 openshift_ip=192.168.1.115 openshift_schedulable=true openshift_node_labels="{'region': 'infra', 'zone': 'default', 'nodetype': 'storage'}"
192.168.1.116 openshift_ip=192.168.1.116 openshift_schedulable=true openshift_node_labels="{'region': 'infra', 'zone': 'default', 'nodetype': 'storage'}"
192.168.1.117 openshift_ip=192.168.1.117 openshift_schedulable=true openshift_node_labels="{'region': 'infra', 'zone': 'default', 'nodetype': 'storage'}"
[glusterfs]
192.168.1.115 glusterfs_ip=192.168.1.115 glusterfs_devices='[ "/dev/sdb" ]'
192.168.1.116 glusterfs_ip=192.168.1.116 glusterfs_devices='[ "/dev/sdb" ]'
192.168.1.117 glusterfs_ip=192.168.1.117 glusterfs_devices='[ "/dev/sdb" ]'
[glusterfs_registry]
192.168.1.115 glusterfs_ip=192.168.1.115 glusterfs_devices='[ "/dev/sdb" ]'
192.168.1.116 glusterfs_ip=192.168.1.116 glusterfs_devices='[ "/dev/sdb" ]'
192.168.1.117 glusterfs_ip=192.168.1.117 glusterfs_devices='[ "/dev/sdb" ]'
[OSEv3:vars]
ansible_ssh_user=root
enable_excluders=False
enable_docker_excluder=False
ansible_service_broker_install=False
containerized=True
os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability
openshift_node_kubelet_args={'pods-per-core': ['50']}
deployment_type=origin
openshift_deployment_type=origin
openshift_release=v3.9
openshift_pkg_version=v3.9
openshift_image_tag=v3.9.0
openshift_service_catalog_image_version=v3.9
template_service_broker_image_version=v3.9
openshift_repos_enable_testing=true
osm_use_cockpit=true
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
openshift_public_hostname=console.mydomain.org
openshift_master_default_subdomain=apps.mydomain.org
openshift_storage_glusterfs_namespace=glusterfs
openshift_storage_glusterfs_name=storage
openshift_hosted_registry_storage_kind=glusterfs
openshift_hosted_registry_replicas=3
openshift_hosted_registry_storage_volume_size=5Gi
````
And installed it with these commands:
````bash
git clone https://github.com/openshift/openshift-ansible.git
cd openshift-ansible
git fetch
git checkout release-3.9
cd ..
ansible-playbook -i inventory.ini openshift-ansible/playbooks/prerequisites.yml
ansible-playbook -i inventory.ini openshift-ansible/playbooks/deploy_cluster.yml
````
With this result

Then I've installed something to test (kube-ops-view, for example):
````bash
oc new-project ocp-ops-view
oc create sa kube-ops-view
oc adm policy add-scc-to-user anyuid -z kube-ops-view
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:ocp-ops-view:kube-ops-view
oc apply -f https://raw.githubusercontent.com/raffaelespazzoli/kube-ops-view/ocp/deploy-openshift/kube-ops-view.yaml
oc expose svc kube-ops-view
````
With this result

````
[root@openshift01 installcentos]# oc version
oc v3.9.0+4814d4a-3
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://openshift01:8443
openshift v3.9.0+4814d4a-3
kubernetes v1.9.1+a0ce1bc657
````
I installed the 4 nodes with the inventory file above and executed the Ansible playbooks, but I cannot use the cluster.
Could you post `oc describe pod/pod-...` for some of the pods that failed scheduling? I see the partial error `0/4 nodes are available: 4 MatchNodeSelector`. It sounds like a node labeling issue, where some nodes should be marked as compute nodes. IIRC a default node selector was introduced in 3.9, and I'm not sure whether this is well documented.
Hi @akostadinov, here is the output for the 2 pending pods:
````bash
[root@openshift01 localvolumes]# oc get all
NAME                         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/kube-ops-view         1         1         1            0           11h
deploy/kube-ops-view-redis   1         1         1            0           11h

NAME                                DESIRED   CURRENT   READY     AGE
rs/kube-ops-view-758bf655f4         1         1         0         11h
rs/kube-ops-view-redis-7cd4b9cccc   1         1         0         11h

NAME                   HOST/PORT                                        PATH      SERVICES        PORT      TERMINATION   WILDCARD
routes/kube-ops-view   kube-ops-view-ocp-ops-view.apps.tonimoreno.org             kube-ops-view   8080                    None

NAME                                      READY     STATUS    RESTARTS   AGE
po/kube-ops-view-758bf655f4-9s8vf         0/1       Pending   0          11h
po/kube-ops-view-redis-7cd4b9cccc-4gp8l   0/1       Pending   0          11h

NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
svc/kube-ops-view         ClusterIP   172.30.48.133
svc/kube-ops-view-redis   ClusterIP   172.30.22.19
[root@openshift01 localvolumes]# oc describe po/kube-ops-view-758bf655f4-9s8vf
Name:           kube-ops-view-758bf655f4-9s8vf
Namespace:      ocp-ops-view
Node:
Labels:         application=kube-ops-view
                pod-template-hash=3146921190
                version=v0.0.1
Annotations:    openshift.io/scc=anyuid
Status:         Pending
IP:
Controlled By:  ReplicaSet/kube-ops-view-758bf655f4
Containers:
  service:
    Image:  raffaelespazzoli/ocp-ops-view:latest
    Port:   8080/TCP
    Args:
      --redis-url=redis://kube-ops-view-redis:6379
    Limits:
      cpu:     200m
      memory:  100Mi
    Requests:
      cpu:        50m
      memory:     50Mi
    Readiness:    http-get http://:8080/health delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-ops-view-token-kp5c2 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  kube-ops-view-token-kp5c2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kube-ops-view-token-kp5c2
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  node-role.kubernetes.io/compute=true
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  8m (x1512 over 7h)  default-scheduler  0/4 nodes are available: 2 NodeNotReady, 2 NodeOutOfDisk, 4 MatchNodeSelector.
  Warning  FailedScheduling  3m (x377 over 10h)  default-scheduler  0/4 nodes are available: 1 NodeNotReady, 1 NodeOutOfDisk, 4 MatchNodeSelector.
[root@openshift01 localvolumes]# oc describe po/kube-ops-view-redis-7cd4b9cccc-4gp8l
Name:           kube-ops-view-redis-7cd4b9cccc-4gp8l
Namespace:      ocp-ops-view
Node:
Labels:         application=kube-ops-view-redis
                pod-template-hash=3780657777
                version=v0.0.1
Annotations:    openshift.io/scc=anyuid
Status:         Pending
IP:
Controlled By:  ReplicaSet/kube-ops-view-redis-7cd4b9cccc
Containers:
  redis:
    Image:  redis:3.2-alpine
    Port:   6379/TCP
    Limits:
      cpu:     200m
      memory:  100Mi
    Requests:
      cpu:        50m
      memory:     50Mi
    Readiness:    tcp-socket :6379 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-ops-view-token-kp5c2 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  kube-ops-view-token-kp5c2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kube-ops-view-token-kp5c2
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  node-role.kubernetes.io/compute=true
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  8m (x1512 over 7h)  default-scheduler  0/4 nodes are available: 2 NodeNotReady, 2 NodeOutOfDisk, 4 MatchNodeSelector.
  Warning  FailedScheduling  3m (x377 over 10h)  default-scheduler  0/4 nodes are available: 1 NodeNotReady, 1 NodeOutOfDisk, 4 MatchNodeSelector.
````
Here is the node info, just now:
````
[root@openshift01 localvolumes]# oc get nodes
NAME          STATUS    ROLES     AGE       VERSION
openshift01   Ready     master    12h       v1.9.1+a0ce1bc657
openshift05   Ready     <none>    11h       v1.9.1+a0ce1bc657
openshift06   Ready     <none>    11h       v1.9.1+a0ce1bc657
openshift07   Ready     <none>    11h       v1.9.1+a0ce1bc657
````
Ok, so in the pods you can see `Node-Selectors: node-role.kubernetes.io/compute=true`. You need to have this label set on the nodes you want to use for compute, i.e. run `oc edit node openshift05` and insert the label `node-role.kubernetes.io/compute: "true"`. That should make your pods schedulable.
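As a non-interactive alternative to `oc edit`, the same label can be applied with `oc label`. A sketch using the node names from this thread; the loop echoes the commands first so they can be reviewed, and dropping `echo` would actually run them:

````bash
# Label each worker node as a compute node (node names from this thread).
# "echo" makes this a dry run; remove it to apply the labels for real.
for node in openshift05 openshift06 openshift07; do
  echo oc label node "$node" node-role.kubernetes.io/compute=true
done
````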
Another approach would be to remove the default node selector from `master-config.yaml`, but I prefer setting the necessary labels.
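If you do take the master-config route, the cluster-wide default node selector lives under `projectConfig` in the master configuration. A sketch of the relevant fragment, assuming the 3.9 master config schema; treat the exact current value as an assumption for your install:

````yaml
# /etc/origin/master/master-config.yaml (fragment, illustrative values)
projectConfig:
  # In 3.9 this typically defaults to the compute-node selector;
  # setting it to "" removes the cluster-wide default node selector.
  defaultNodeSelector: "node-role.kubernetes.io/compute=true"
````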
P.S. I'm somewhat worried about `1 NodeNotReady, 1 NodeOutOfDisk`. `oc get nodes` doesn't show anything, but it is worth checking `oc describe nodes` as well as looking at the node logs for anything bad.
Hi @akostadinov, I've edited the nodes and added the new label `node-role.kubernetes.io/compute: "true"`, and now everything is working fine!
````
[root@openshift01 installcentos]# oc get all
NAME                         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/kube-ops-view         1         1         1            1           4m
deploy/kube-ops-view-redis   1         1         1            1           4m

NAME                                DESIRED   CURRENT   READY     AGE
rs/kube-ops-view-758bf655f4         1         1         1         4m
rs/kube-ops-view-redis-7cd4b9cccc   1         1         1         4m

NAME                   HOST/PORT                                   PATH      SERVICES        PORT      TERMINATION   WILDCARD
routes/kube-ops-view   kube-ops-view-ocp-ops-view.apps.myorg.org             kube-ops-view   8080                    None

NAME                                      READY     STATUS    RESTARTS   AGE
po/kube-ops-view-758bf655f4-9sff8         1/1       Running   0          4m
po/kube-ops-view-redis-7cd4b9cccc-7c9l5   1/1       Running   0          4m

NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
svc/kube-ops-view         ClusterIP   172.30.58.199
svc/kube-ops-view-redis   ClusterIP   172.30.135.103
````

Somewhere in the Origin docs it should be specified that this is a mandatory label to get things running.
Don't worry about NodeNotReady; sometimes I run node crash simulations (to test service recovery).
Thanks a lot. Now I'm going to get back to the local volume issue.
@akostadinov sorry for this question... but while editing /etc/origin/node/node-config.yaml I've seen that I don't have the new label inside the node config:
````yaml
....
kind: NodeConfig
kubeletArguments:
  node-labels:
....
````
But it does have it online:
````
[root@openshift01 installcentos]# oc describe node openshift01
Name:    openshift01
Roles:   compute,master
Labels:  beta.kubernetes.io/arch=amd64
         beta.kubernetes.io/os=linux
         kubernetes.io/hostname=openshift01
         node-role.kubernetes.io/compute=true
         node-role.kubernetes.io/master=true
         nodetype=default
         region=infra
         zone=default
````
Should I assume that I will lose the label on the next node restart?
Wrt default labeling, this is documented in the quick install guide, which actually points at the release notes.
I am wondering whether you followed the guide and the end result is still missing the labels. In that case something needs to be fixed.
Wrt your question, I believe that restarting the node shouldn't lose your label change. But if you remove the node from the API with `oc delete node mynode` and then start the node again, it will be recreated without your modifications. I don't think `oc edit` is supposed to change node-config.yaml. But if you believe this is necessary, feel free to file an RFE so we can see what the team's opinion about it is.
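For completeness: if you want a label that survives node re-registration, it can be declared in the node's own config so the kubelet applies it at startup. A sketch based on the `node-config.yaml` fragment quoted above; the label value is the one from this thread, and its placement under `kubeletArguments` is an assumption matching the kubelet's `--node-labels` flag:

````yaml
# /etc/origin/node/node-config.yaml (fragment)
kubeletArguments:
  node-labels:
    # Passed to the kubelet's --node-labels flag at startup.
    - "node-role.kubernetes.io/compute=true"
````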
Life SAVER