On a single-master setup, I am getting the error below while installing through Ansible.
ansible --version
If you're operating from a git clone:
git describe
Describe what you expected to happen.
The OpenShift cluster should be installed.
Describe what is actually happening.
FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (2 retries left). Result was: {
"attempts": 29,
"changed": true,
"cmd": [
"oc",
"get",
"crd",
"servicemonitors.monitoring.coreos.com",
"-n",
"openshift-monitoring",
"--config=/tmp/openshift-cluster-monitoring-ansible-AOOcA3/admin.kubeconfig"
],
"delta": "0:00:00.251324",
"end": "2018-10-17 12:43:50.411317",
"invocation": {
"module_args": {
"_raw_params": "oc get crd servicemonitors.monitoring.coreos.com -n openshift-monitoring --config=/tmp/openshift-cluster-monitoring-ansible-AOOcA3/admin.kubeconfig",
"_uses_shell": false,
"argv": null,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"warn": true
}
},
"msg": "non-zero return code",
"rc": 1,
"retries": 31,
"start": "2018-10-17 12:43:50.159993",
"stderr": "No resources found.\nError from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found",
"stderr_lines": [
"No resources found.",
"Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found"
],
"stdout": "",
"stdout_lines": []
}
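For reference, the failing task is a retry loop around the `oc get crd` command shown in `cmd` above. A minimal sketch of the same kind of check, run by hand, might look like the following. The function name `wait_for`, the attempt count, and the delay are illustrative assumptions, not what openshift-ansible uses internally.

```shell
#!/bin/sh
# wait_for ATTEMPTS DELAY COMMAND...
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS runs
# are used up. Returns 0 on success, 1 if every attempt failed.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        echo "attempt $i/$attempts failed: $*" >&2
        i=$((i + 1))
        # Only pause when another attempt is still coming.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (not executed here; needs an oc client with cluster access):
#   wait_for 30 10 oc get crd servicemonitors.monitoring.coreos.com
```

If the loop never succeeds, the retry itself is probably not the problem; it may be more useful to look at why the pods in the openshift-monitoring namespace are not starting.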
Provide any additional information which may help us diagnose the
issue.
Your operating system and version, ie: RHEL 7.2, Fedora 23 ($ cat /etc/redhat-release)
CentOS Linux release 7.5.1804 (Core)
Your inventory file (especially any non-standard configuration parameters)
[OSEv3:children]
masters
nodes
etcd
[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin
# localhost likely doesn't meet the minimum requirements
openshift_disable_check=disk_availability,memory_availability
openshift_additional_repos=[{'id': 'centos-okd-ci', 'name': 'centos-okd-ci', 'baseurl': 'https://rpms.svc.ci.openshift.org/openshift-origin-v3.11', 'gpgcheck': '0', 'enabled': '1'}]
openshift_public_hostname=console.10.0.2.15.nip.io
openshift_master_default_subdomain=apps.10.0.2.15.nip.io
openshift_master_api_port=8443
openshift_master_console_port=8443
[masters]
c1-ocp openshift_ip=10.0.2.15 openshift_schedulable=true
[etcd]
c1-ocp openshift_ip=10.0.2.15
[nodes]
c1-ocp openshift_ip=10.0.2.15 openshift_schedulable=true openshift_node_group_name="node-config-all-in-one"
The output of tail -f /var/log/messages is below:
https://gist.github.com/imranrazakhan/fa69035bdad111a27dc354e2fc44ec50
@ Try restarting the Docker service. sudo systemctl restart docker
@ThilinaManamgoda I am having the same issues, restarting the Docker service on all the nodes didn't fix it.
Same issue here.. :(
Same issue here. Also with a 1-node setup.
Found the fix
File /etc/sysconfig/network-scripts/ifcfg-eth0 (CentOS)
There is a flag NM_CONTROLLED=no
Set it to yes, reboot the system and run the Ansible script again. Fixed the problem for me.
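Before editing and rebooting, it may help to see what each interface file currently says. The sketch below prints the `NM_CONTROLLED` value from an ifcfg file; the function name `nm_controlled_value` is made up for this example, and whether changing the flag helps depends on the host, as later replies show.

```shell
#!/bin/sh
# nm_controlled_value FILE
# Hypothetical helper: prints the value of NM_CONTROLLED from an ifcfg
# file (surrounding quotes stripped), or "unset" if the key is absent.
nm_controlled_value() {
    value=$(sed -n 's/^[[:space:]]*NM_CONTROLLED=//p' "$1" | tail -n 1 | tr -d "\"'")
    echo "${value:-unset}"
}

# Example (not executed here):
#   for f in /etc/sysconfig/network-scripts/ifcfg-*; do
#       echo "$f: $(nm_controlled_value "$f")"
#   done
```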
I got the same issue and, unfortunately, the NM_CONTROLLED fix for the eth0 interface didn't work for me!
@mithandir, are you sure that was the problem? By default, even without the option you mentioned, NM takes control of all available interfaces in the system!
I fixed the issue by reinstalling the dnsmasq RPM package and cleaning out its config files completely.
To check whether your problem is the same as mine, see whether the file /etc/cni/net.d/80-openshift-network.conf exists. It is generated by openshift-ansible during installation!
Same issue here!
Any help?
Using rhel 7 with openshift-enterprise.
@dmnord, I have a similar problem to yours. I tried NM_CONTROLLED=yes and it did not fix the issue. I also removed and reinstalled dnsmasq with yum, but that doesn't fix it either. Can you give me a little more detail about the config files and what needs to be checked? I am running the Ansible playbooks locally on a single node.
Same problem for me, adding NM_CONTROLLED=yes to ifcfg-eth0 and rebooting did not help.
Same issue here!
Any help?
Using rhel 7.5 with openshift-origin.
Same issue here!
Using centos 7.6 with openshift-origin.
The issue is still present in openshift-origin 3.11 under CentOS 7.5.
I have the same issue.
In my case, internal DNS was not configured. Manually create /etc/origin/node/resolv.conf and /etc/dnsmasq.d/origin-upstream-dns.conf (i.e. with the contents of /etc/dnsmasq.d/ose-dns).
I had the same issue. Here is what I did.
I searched for the text in bold and found the following solution.
Red Hat suggests deleting the extra files and keeping only 80-openshift-network.conf, but I only moved 100-crio-bridge.conf and 200-loopback.conf to another directory. After that, I rebooted all my nodes, ran playbooks/openshift-monitoring/config.yml again on the master node, and it worked.
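The move-instead-of-delete approach described above can be sketched as a small function. The name `stash_extra_cni_configs` and the backup location are assumptions for illustration; it would need to run as root on each node, followed by the reboot and playbook rerun mentioned above.

```shell
#!/bin/sh
# stash_extra_cni_configs CNI_DIR BACKUP_DIR
# Hypothetical helper: moves every regular file in CNI_DIR except
# 80-openshift-network.conf into BACKUP_DIR, printing each file it moves.
stash_extra_cni_configs() {
    cni_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for f in "$cni_dir"/*; do
        # Skips directories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        case "$(basename "$f")" in
            80-openshift-network.conf) ;;
            *) mv "$f" "$backup_dir"/ && echo "moved $f" ;;
        esac
    done
}

# Example (not executed here; /root/cni-backup is an arbitrary choice):
#   stash_extra_cni_configs /etc/cni/net.d /root/cni-backup
```

Keeping the files in a backup directory makes it easy to restore them if the change does not help.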
Thanks @ArturoArreola
Deleting everything in /etc/cni/net.d on all the masters/nodes, rebooting, and rerunning the installation worked for me.
Note: I had three calico files in there from a previous installation attempt using calico. The directory also did not contain 80-openshift-network.conf.
Same Issue on a VSphere VM, even after trying every suggestion so far.
This also fixed the issue for me. Deleted files in /etc/cni/net.d/ on all nodes, rebooted all nodes, and reran the prerequisites and deploy_cluster playbooks. Not sure if rerunning prerequisites was necessary, but it didn't hurt.
Hit this issue but the /etc/cni/net.d directory was empty when I looked to delete anything in it and reboot.
Dupe of #10969
Hi,
in my case, with RHEL 7.5 and OCP 3.11,
after cleaning out /etc/cni/net.d and rebooting the nodes, I got this error:
[ocpadmin@--- ~]$ sudo docker logs 657ff4ddf6bd
2019/04/13 08:51:11 socat[125450] E connect(5, AF=1 "/var/run/openshift-sdn/cni-server.sock", 40): No such file or directory
User "sa" set.
Context "default-context" modified.
which: no openshift-sdn in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
I0413 08:51:12.148125 125426 start_network.go:200] Reading node configuration from /etc/origin/node/node-config.yaml
I0413 08:51:12.150719 125426 start_network.go:207] Starting node networking --- (v3.11.98)
W0413 08:51:12.150893 125426 server.go:195] WARNING: all flags other than --config, --write-config-to, and --cleanup are deprecated. Please begin using a config file ASAP.
I0413 08:51:12.150939 125426 feature_gate.go:230] feature gates: &{map[]}
I0413 08:51:12.152503 125426 transport.go:160] Refreshing client certificate from store
I0413 08:51:12.152538 125426 certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
I0413 08:51:12.173098 125426 node.go:147] Initializing SDN node of type "redhat/openshift-ovs-networkpolicy" with configured hostname "---" (IP ""), iptables sync period "30s"
I0413 08:51:12.178064 125426 node.go:289] Starting openshift-sdn network plugin
F0413 08:51:12.222505 125426 network.go:46] SDN node startup failed: node SDN setup failed: net/ipv4/ip_forward=0, it must be set to 1
The error message was descriptive enough to show how to solve it.
I ran
sudo sysctl -w net.ipv4.ip_forward=1
on all nodes. After that, the services (containers) started correctly:
[ocpadmin@--- ~]$ sudo docker logs d642ce2cdd16
2019/04/13 13:52:12 socat[22784] E connect(5, AF=1 "/var/run/openshift-sdn/cni-server.sock", 40): No such file or directory
User "sa" set.
Context "default-context" modified.
which: no openshift-sdn in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
I0413 13:52:13.432526 22757 start_network.go:200] Reading node configuration from /etc/origin/node/node-config.yaml
I0413 13:52:13.434889 22757 start_network.go:207] Starting node networking --- (v3.11.98)
W0413 13:52:13.435053 22757 server.go:195] WARNING: all flags other than --config, --write-config-to, and --cleanup are deprecated. Please begin using a config file ASAP.
I0413 13:52:13.435097 22757 feature_gate.go:230] feature gates: &{map[]}
I0413 13:52:13.436525 22757 transport.go:160] Refreshing client certificate from store
I0413 13:52:13.436559 22757 certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
I0413 13:52:13.457083 22757 node.go:147] Initializing SDN node of type "redhat/openshift-ovs-networkpolicy" with configured hostname "---" (IP ""), iptables sync period "30s"
I0413 13:52:13.461645 22757 node.go:289] Starting openshift-sdn network plugin
I0413 13:52:13.499553 22757 sdn_controller.go:139] [SDN setup] full SDN setup required (Link not found)
I0413 13:52:13.716938 22757 vnids.go:148] Associate netid 0 to namespace "default" with mcEnabled false
I0413 13:52:13.716963 22757 vnids.go:148] Associate netid 8826639 to namespace "kube-public" with mcEnabled false
I0413 13:52:13.716969 22757 vnids.go:148] Associate netid 15881286 to namespace "kube-system" with mcEnabled false
I0413 13:52:13.716974 22757 vnids.go:148] Associate netid 7199253 to namespace "management-infra" with mcEnabled false
I0413 13:52:13.716987 22757 vnids.go:148] Associate netid 4301137 to namespace "openshift" with mcEnabled false
I0413 13:52:13.716994 22757 vnids.go:148] Associate netid 7558408 to namespace "openshift-infra" with mcEnabled false
I0413 13:52:13.717000 22757 vnids.go:148] Associate netid 4191676 to namespace "openshift-logging" with mcEnabled false
I0413 13:52:13.717004 22757 vnids.go:148] Associate netid 5736859 to namespace "openshift-monitoring" with mcEnabled false
I0413 13:52:13.717013 22757 vnids.go:148] Associate netid 16042816 to namespace "openshift-node" with mcEnabled false
I0413 13:52:13.717018 22757 vnids.go:148] Associate netid 2480811 to namespace "openshift-sdn" with mcEnabled false
I0413 13:52:13.717023 22757 vnids.go:148] Associate netid 4550475 to namespace "openshift-web-console" with mcEnabled false
I0413 13:52:13.747827 22757 node.go:348] Starting openshift-sdn pod manager
E0413 13:52:13.750540 22757 cniserver.go:148] failed to remove old pod info socket: remove /var/run/openshift-sdn: device or resource busy
E0413 13:52:13.750614 22757 cniserver.go:151] failed to remove contents of socket directory: remove /var/run/openshift-sdn: device or resource busy
I0413 13:52:13.756884 22757 node.go:392] openshift-sdn network plugin registering startup
I0413 13:52:13.757080 22757 node.go:410] openshift-sdn network plugin ready
I0413 13:52:13.759593 22757 network.go:95] Using iptables Proxier.
I0413 13:52:13.761317 22757 networkpolicy.go:286] SyncVNIDRules: 0 unused VNIDs
W0413 13:52:13.761852 22757 proxier.go:298] missing br-netfilter module or unset sysctl br-nf-call-iptables; proxy may not work as intended
I0413 13:52:13.761949 22757 network.go:131] Tearing down userspace rules.
I0413 13:52:13.787101 22757 proxier.go:189] Setting proxy IP to 10.71.87.139 and initializing iptables
I0413 13:52:13.809249 22757 proxy.go:82] Starting multitenant SDN proxy endpoint filter
I0413 13:52:13.809377 22757 config.go:202] Starting service config controller
I0413 13:52:13.809398 22757 controller_utils.go:1025] Waiting for caches to sync for service config controller
I0413 13:52:13.813988 22757 config.go:102] Starting endpoints config controller
I0413 13:52:13.814002 22757 controller_utils.go:1025] Waiting for caches to sync for endpoints config controller
I0413 13:52:13.814045 22757 network.go:239] Started Kubernetes Proxy on 0.0.0.0
I0413 13:52:13.815125 22757 network.go:53] Starting DNS on 127.0.0.1:53
I0413 13:52:13.817420 22757 server.go:76] Monitoring dnsmasq to point cluster queries to 127.0.0.1
I0413 13:52:13.817506 22757 logs.go:49] skydns: ready for queries on cluster.local. for tcp://127.0.0.1:53 [rcache 0]
I0413 13:52:13.817520 22757 logs.go:49] skydns: ready for queries on cluster.local. for udp://127.0.0.1:53 [rcache 0]
I0413 13:52:13.818280 22757 roundrobin.go:276] LoadBalancerRR: Setting endpoints for default/kubernetes:https to [10.71.87.135:8443 10.71.87.138:8443]
I0413 13:52:13.818371 22757 roundrobin.go:276] LoadBalancerRR: Setting endpoints for default/kubernetes:dns-tcp to [10.71.87.135:8053 10.71.87.138:8053]
I0413 13:52:13.818385 22757 roundrobin.go:276] LoadBalancerRR: Setting endpoints for default/kubernetes:dns to [10.71.87.135:8053 10.71.87.138:8053]
I0413 13:52:13.909542 22757 controller_utils.go:1032] Caches are synced for service config controller
I0413 13:52:13.909628 22757 proxier.go:629] Not syncing iptables until Services and Endpoints have been received from master
I0413 13:52:13.914112 22757 controller_utils.go:1032] Caches are synced for endpoints config controller
I0413 13:52:13.914192 22757 service.go:314] Adding new service port "default/router:1936-tcp" at 10.6.238.197:1936/TCP
I0413 13:52:13.914216 22757 service.go:314] Adding new service port "default/router:80-tcp" at 10.6.238.197:80/TCP
I0413 13:52:13.914226 22757 service.go:314] Adding new service port "default/router:443-tcp" at 10.6.238.197:443/TCP
I0413 13:52:13.914236 22757 service.go:314] Adding new service port "openshift-web-console/webconsole:https" at 10.6.148.61:443/TCP
I0413 13:52:13.914249 22757 service.go:314] Adding new service port "default/docker-registry:5000-tcp" at 10.6.164.93:5000/TCP
I0413 13:52:13.914262 22757 service.go:314] Adding new service port "default/kubernetes:dns-tcp" at 10.6.0.1:53/TCP
I0413 13:52:13.914272 22757 service.go:314] Adding new service port "default/kubernetes:https" at 10.6.0.1:443/TCP
I0413 13:52:13.914282 22757 service.go:314] Adding new service port "default/kubernetes:dns" at 10.6.0.1:53/UDP
I0413 13:52:13.914291 22757 service.go:314] Adding new service port "default/registry-console:registry-console" at 10.6.182.175:9000/TCP
I0413 13:52:13.914314 22757 proxier.go:643] Stale udp service default/kubernetes:dns -> 10.6.0.1
[ocpadmin@--- ~]$
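One caveat on the `sysctl -w` fix above: a value set that way lasts only until the next reboot. A minimal sketch for checking the current value and writing a persistent drop-in follows; the function names and the drop-in file name in the example are assumptions, and another sysctl file that sets the value to 0 could still override it.

```shell
#!/bin/sh
# ip_forward_enabled [FILE]
# Hypothetical helper: succeeds if the value read from FILE is 1.
# FILE defaults to the kernel's live setting.
ip_forward_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = "1" ]
}

# persist_ip_forward CONF_FILE
# Hypothetical helper: writes the setting to a sysctl configuration file
# so that it is applied again at boot.
persist_ip_forward() {
    printf 'net.ipv4.ip_forward = 1\n' > "$1"
}

# Example (not executed here; run as root, file name is arbitrary):
#   ip_forward_enabled || sysctl -w net.ipv4.ip_forward=1
#   persist_ip_forward /etc/sysctl.d/99-ip-forward.conf
#   sysctl --system
```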
For me, the steps below worked. On each host:
vi /etc/selinux/config
SELINUX=enforcing
SELINUXTYPE=targeted
vi /etc/hosts with appropriate entries.
vi /etc/sysconfig/network-scripts/ifcfg-eth1
NM_CONTROLLED=yes
PEERDNS=yes
hostnamectl set-hostname
reboot
yum update
I have the same issue installing OpenShift 3.11 on RHEL 7.7.
FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (30 retries left).
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 23m default-scheduler Successfully assigned openshift-monitoring/cluster-monitoring-operator-66d586cd9f-5pl72 to master3.cn.cjcc.com
Warning FailedCreatePodSandBox 23m kubelet, master3.cn.cjcc.com Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "6760020b38e3544a997adb8b671c001615eeadc323b04107264b26e698c95712" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Warning FailedCreatePodSandBox 23m kubelet, master3.cn.cjcc.com Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f040789aa821d4494eb0431b2ed1f32c95ae5fab8c6f0d3b6c90b30f2058fc36" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Warning FailedCreatePodSandBox 23m kubelet, master3.cn.cjcc.com Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "4ff8377aca1beb5123fde0dce1d9c80488fadd3a8d9f820325083ed700ddc330" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Warning FailedCreatePodSandBox 23m kubelet, master3.cn.cjcc.com Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e22ade8b9d7e53129d9a85c0bb5fd90dc36b8b062aa175482177b78368992c1b" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Warning FailedCreatePodSandBox 23m kubelet, master3.cn.cjcc.com Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "3652e850fbd3ea489588402a08589271c5f2b2b889ded77255bf2e55a2385657" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Warning FailedCreatePodSandBox 23m kubelet, master3.cn.cjcc.com Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "79c2f30ea8c50719cd7d955a091c7c6d75eff8c7adb4ff35be0ae163d0a0186e" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Warning FailedCreatePodSandBox 22m kubelet, master3.cn.cjcc.com Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "2157c36e309a663a029afeb672250e3b12fe80b4168ee8a3a96947a231de1ba2" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Warning FailedCreatePodSandBox 22m kubelet, master3.cn.cjcc.com Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "1e452c43da97a71742cef53646bb02778027fc949c9a2c092a5a35e569648f41" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Warning FailedCreatePodSandBox 22m kubelet, master3.cn.cjcc.com Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "cc6def1fe654506f0e6f3fa27b70887b64c61725f881102af709bcaa77f3e45a" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Warning FailedCreatePodSandBox 8m (x141 over 22m) kubelet, master3.cn.cjcc.com (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "940f9202731f08314c73d404fef35740d7d318942fc86e439db7ff3338650276" network for pod "cluster-monitoring-operator-66d586cd9f-5pl72": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-66d586cd9f-5pl72_openshift-monitoring" network: OpenShift SDN network process is not (yet?) available
Normal SandboxChanged 3m (x211 over 23m) kubelet, master3.cn.cjcc.com Pod sandbox changed, it will be killed and re-created.
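The events above say the OpenShift SDN network process is not available on that node, so a reasonable next step is to look at the pods in the openshift-sdn namespace. The sketch below filters the default `oc get pods` table for anything not in the Running state; the function name `not_running_pods` is made up for this example, and it assumes the default column order (NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# not_running_pods
# Hypothetical helper: reads `oc get pods` table output on stdin and
# prints the name and status of every pod whose STATUS is not "Running".
not_running_pods() {
    awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Example (not executed here; needs an oc client with cluster access):
#   oc -n openshift-sdn get pods | not_running_pods
```

For any pod it lists, `oc -n openshift-sdn logs <pod>` may show the underlying reason, such as the ip_forward failure quoted earlier in this thread.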