Local VMs deployment fails with "NetworkManager must be installed and enabled prior to installation" even though NetworkManager is running on the host machine.
ansible 2.3.1.0
config file = /root/openshift-ansible/ansible.cfg
configured module search path = Default w/o overrides
python version = 2.7.13 (default, Jun 26 2017, 10:20:05) [GCC 7.1.1 20170622 (Red Hat 7.1.1-3)]
openshift-ansible-3.7.1-1-40-g3be2748d
bin/cluster create libvirt lenaic
Expected: Successfully deployed OpenShift.
Task openshift_node_dnsmasq fails with:
TASK [openshift_node_dnsmasq : fail] ***************************************************************************************************************************
Monday 31 July 2017 13:41:55 +0200 (0:00:00.869) 0:46:40.076 ***********
fatal: [lenaic-node-compute-5db33]: FAILED! => {
"changed": false,
"failed": true
}
MSG:
Currently, NetworkManager must be installed and enabled prior to installation.
fatal: [lenaic-node-compute-12c02]: FAILED! => {
"changed": false,
"failed": true
}
MSG:
Currently, NetworkManager must be installed and enabled prior to installation.
fatal: [lenaic-node-infra-e10d2]: FAILED! => {
"changed": false,
"failed": true
}
MSG:
Currently, NetworkManager must be installed and enabled prior to installation.
PLAY RECAP *****************************************************************************************************************************************************
lenaic-master-2ab2c : ok=327 changed=69 unreachable=0 failed=0
lenaic-node-compute-12c02 : ok=208 changed=40 unreachable=0 failed=1
lenaic-node-compute-5db33 : ok=212 changed=43 unreachable=0 failed=1
lenaic-node-infra-e10d2 : ok=208 changed=40 unreachable=0 failed=1
localhost : ok=96 changed=45 unreachable=0 failed=0
Monday 31 July 2017 13:41:55 +0200 (0:00:00.158) 0:46:40.234 ***********
===============================================================================
os_update_latest : Update all packages ------------------------------- 1542.56s
docker : Install Docker ----------------------------------------------- 169.61s
openshift_common : Install the base package for versioning ------------ 105.35s
docker : Install Docker ------------------------------------------------ 70.03s
openshift_common : Install the base package for versioning ------------- 50.87s
Wait for the VMs to get an IP ------------------------------------------ 39.08s
Wait for the VMs to get an IP ------------------------------------------ 38.80s
Wait for the VMs to get an IP ------------------------------------------ 22.08s
openshift_facts : Ensure various deps are installed -------------------- 20.78s
openshift_excluder : Install docker excluder --------------------------- 11.41s
openshift_excluder : Install openshift excluder ------------------------ 11.13s
openshift_docker_facts : Set docker facts ------------------------------ 10.73s
openshift_docker_facts : Set docker facts ------------------------------ 10.01s
openshift_manageiq : Configure role/user permissions -------------------- 9.98s
openshift_master : Start and enable master ------------------------------ 9.23s
docker : Start the Docker service --------------------------------------- 8.49s
openshift_facts : Gather Cluster facts and set is_containerized if needed --- 7.85s
openshift_docker_facts : Set docker facts ------------------------------- 7.51s
openshift_master : Create master config --------------------------------- 7.04s
openshift_facts --------------------------------------------------------- 6.96s
Failure summary:
1. Host: lenaic-node-compute-5db33
Play: Configure nodes
Task: openshift_node_dnsmasq : fail
Message: Currently, NetworkManager must be installed and enabled prior to installation.
2. Host: lenaic-node-compute-12c02
Play: Configure nodes
Task: openshift_node_dnsmasq : fail
Message: Currently, NetworkManager must be installed and enabled prior to installation.
3. Host: lenaic-node-infra-e10d2
Play: Configure nodes
Task: openshift_node_dnsmasq : fail
Message: Currently, NetworkManager must be installed and enabled prior to installation.
ACTION [create] failed: Command 'ansible-playbook -i inventory/libvirt/hosts -e 'num_masters=1 num_nodes=2 cluster_id=lenaic cluster_env=dev num_etcd=0 num_infra=1 deployment_type=origin' playbooks/libvirt/openshift-cluster/launch.yml' returned non-zero exit status 2
Host system: Fedora release 26 (Twenty Six)
@phoracek what was the output of systemctl show NetworkManager on the failed node?
This is the part of the playbook where it fails (openshift-ansible/playbooks/byo/roles/openshift_node_dnsmasq/tasks/main.yml):
- name: Check for NetworkManager service
  command: >
    systemctl show NetworkManager
  register: nm_show
  changed_when: false
  ignore_errors: True
- name: Set fact using_network_manager
  set_fact:
    network_manager_active: "{{ True if 'ActiveState=active' in nm_show.stdout else False }}"
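To reproduce by hand what that fact evaluates, the same substring test can be run against the `systemctl show` output on a node. This is a minimal sketch, not part of openshift-ansible; the function name `nm_is_active` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from openshift-ansible): mirrors the playbook's
# "'ActiveState=active' in nm_show.stdout" test. Reads `systemctl show`
# output on stdin and succeeds only if that fixed string is present.
nm_is_active() {
    grep -qF 'ActiveState=active'
}

# On a real node (assumes systemd is available):
#   systemctl show NetworkManager | nm_is_active && echo active || echo "not active"
```

If this reports "not active" on a node, the playbook's `network_manager_active` fact would be False there as well, and the `fail` task fires.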
I'm experiencing exactly the same failure: openshift-origin, release-3.6 branch, deploying on CentOS 7.3.
This is the output on a freshly created node:
[root@ocp-node-3 ~]# systemctl show NetworkManager | grep ActiveState
ActiveState=inactive
Full output - https://gist.github.com/bacek/fd8a4e141e124b51c75802a77ceac02d
"NetworkManager must be installed" is coming directly from https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_node_dnsmasq/tasks/no-network-manager.yml
@bacek in your case, you need to enable and start NetworkManager:
systemctl enable NetworkManager; systemctl start NetworkManager
The NetworkManager check returns False when NetworkManager is inactive or disabled.
This is where the checks are:
https://github.com/openshift/openshift-ansible/issues/4950#issuecomment-321499515
@aizuddin85 I did that as a workaround, but my expectation is that openshift-ansible will install and enable NM if needed.
@bacek, on a default fresh installation of CentOS 7 or RHEL 7, NetworkManager and firewalld are installed and enabled by default.
Enabling firewalld and NM without knowing the underlying configuration can cause configuration conflicts. I think it is best to keep the playbook simple, without extensive checks.
Could this check be done by the prerequisites.yml playbook? It seems strange to be able to run through the prerequisites and then fail on this when running deploy_cluster.yml.
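Until such a check exists in the prerequisites, something similar can be run by hand before deploy_cluster.yml. The following is only a sketch under assumptions: `nm_preflight` is a hypothetical helper, the host names are placeholders, and it assumes key-based SSH access to each node.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not part of openshift-ansible).
# Prints one line per host and returns non-zero if any host does not
# report NetworkManager as active.
nm_preflight() {
    failed=0
    for host in "$@"; do
        # "systemctl is-active" prints the unit state; it prints "active"
        # only when the unit is running.
        state=$(ssh -o BatchMode=yes "$host" systemctl is-active NetworkManager 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$host: ok"
        else
            echo "$host: NetworkManager state is ${state:-unknown}"
            failed=1
        fi
    done
    return "$failed"
}

# Example (placeholder host names):
#   nm_preflight ose-master ose-node-00 ose-node-01 || echo "fix NetworkManager before deploying"
```

Running it first surfaces the problem in seconds instead of partway through the deploy run.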
In my case the error wasn't solved. I seem to have the same error, but the command that checks the NetworkManager status returns the correct value:
<myprompt>: sudo systemctl show NetworkManager | grep ActiveState
ActiveState=active
The only way to continue the install was to force the fact to true in dnsmasq_install.yaml. But then I got the next error:
Message: The conditional check 'result.rc == 0' failed. The error was: error while evaluating conditional (result.rc == 0): 'dict object' has no attribute 'rc'
And from this point I can't continue the install/deploy process. Does anybody know why NetworkManager is not being detected? And second, why do I get the second error (the result.rc one)? Is it related to my modification, or could it have another cause?
Thanks in advance for help,
R,
Chema
P.S.: The "rc error" was reported at this point in the next playbook execution:
TASK [openshift_node : Restart journald]
fatal:[ip node] FAILED! => {"msg": "The conditional check 'result.rc == 0' failed. The error was: error while evaluating conditional (result.rc == 0): 'dict object' has no attribute 'rc'"}
NOTE - result.rc error "solved": I found that the journald operations (where the rc error was raised) don't support Ansible check mode; when I ran the playbook without a dry run, it continued without any problems. That means my changes to dnsmasq_install.yml were successful. The only changes I made were to comment out the first lines of dnsmasq_install.yaml and to force network_manager_active to "True" (because I know it was running correctly, as I showed at the beginning of this message):
#- name: Check for NetworkManager service
#  command: >
#    systemctl show NetworkManager
#  register: nm_show
#  changed_when: false
#  ignore_errors: True <--- I think this could be the cause of the problem. Maybe the command that checks NetworkManager fails and then the later fact is not set correctly, although in my case it runs fine and a manual execution shows "active".
- name: Set fact using_network_manager
  set_fact:
    #network_manager_active: "{{ True if 'ActiveState=active' in nm_show.stdout else False }}"
    network_manager_active: True
(...)
I've had to create out-of-tree playbooks to ensure that NetworkManager is installed and has been restarted via systemctl, despite seeing that there is a playbook that is supposed to be doing this:
TASK [Check for NetworkManager service] ****************************************************************************
ok: [ose-node-02]
ok: [ose-node-00]
ok: [ose-master]
ok: [ose-node-01]
ok: [ose-node-03]
ok: [ose-node-05]
ok: [ose-node-04]
TASK [fail] ********************************************************************************************************
fatal: [ose-master]: FAILED! => {"changed": false, "msg": "Currently, NetworkManager must be installed and enabled prior to installation."}
fatal: [ose-node-00]: FAILED! => {"changed": false, "msg": "Currently, NetworkManager must be installed and enabled prior to installation."}
fatal: [ose-node-01]: FAILED! => {"changed": false, "msg": "Currently, NetworkManager must be installed and enabled prior to installation."}
fatal: [ose-node-02]: FAILED! => {"changed": false, "msg": "Currently, NetworkManager must be installed and enabled prior to installation."}
fatal: [ose-node-03]: FAILED! => {"changed": false, "msg": "Currently, NetworkManager must be installed and enabled prior to installation."}
fatal: [ose-node-04]: FAILED! => {"changed": false, "msg": "Currently, NetworkManager must be installed and enabled prior to installation."}
fatal: [ose-node-05]: FAILED! => {"changed": false, "msg": "Currently, NetworkManager must be installed and enabled prior to installation."}
to retry, use: --limit @/Users/scolli572/src/openshift/openshift-ansible/playbooks/deploy_cluster.retry
PLAY RECAP *********************************************************************************************************
localhost : ok=11 changed=0 unreachable=0 failed=0
ose-master : ok=39 changed=0 unreachable=0 failed=1
ose-node-00 : ok=19 changed=0 unreachable=0 failed=1
ose-node-01 : ok=18 changed=0 unreachable=0 failed=1
ose-node-02 : ok=18 changed=0 unreachable=0 failed=1
ose-node-03 : ok=19 changed=0 unreachable=0 failed=1
ose-node-04 : ok=18 changed=0 unreachable=0 failed=1
ose-node-05 : ok=18 changed=0 unreachable=0 failed=1
INSTALLER STATUS ***************************************************************************************************
Initialization : In Progress (0:01:45)
Failure summary:
1. Hosts: ose-master, ose-node-00, ose-node-01, ose-node-02, ose-node-03, ose-node-04, ose-node-05
Play: Verify Node NetworkManager
Task: fail
Message: Currently, NetworkManager must be installed and enabled prior to installation.
If you label the master or any other hosts as schedulable, you need to first install and then enable the NetworkManager package:
# yum install NetworkManager
# systemctl enable NetworkManager
# systemctl start NetworkManager
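The three commands above can be wrapped so they are safe to re-run on a node that is already set up. This is a rough sketch, assuming a yum-based host with systemd and root privileges; the function name `ensure_network_manager` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the commands above: each step is skipped
# when the host is already in the desired state.
ensure_network_manager() {
    # Install the package only if rpm does not know it yet.
    if ! rpm -q NetworkManager >/dev/null 2>&1; then
        yum install -y NetworkManager || return 1
    fi
    # Enable at boot only if not already enabled.
    systemctl is-enabled --quiet NetworkManager || systemctl enable NetworkManager || return 1
    # Start only if not already running.
    systemctl is-active --quiet NetworkManager || systemctl start NetworkManager || return 1
}

# Usage, as root on each node:
#   ensure_network_manager && systemctl show NetworkManager | grep ActiveState
```

Note the concern raised earlier in the thread: enabling NetworkManager on a host whose network is configured some other way can cause conflicts, so check the existing configuration first.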
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.