As of today (19/03) I am unable to create a non-containerized OpenShift Origin cluster using the byo/config.yml playbook.
The playbook is run against a clean CentOS 7.4 VM.
My high-level explanation is that since the 3.7.1 RPMs appeared in http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin37/ (they seem to have landed today), the playbook fails because it mixes RPMs from different versions.
ansible version: ansible 2.4.3.0
git describe: openshift-ansible-3.7.39-1
Up until yesterday (when only 3.7.0 RPMs were present in the CentOS repo), I was able to create the cluster without issues.
INSTALLER STATUS ***************************************************************************************************************************
Initialization : Complete
Health Check : Complete
etcd Install : Complete
Master Install : Complete
Master Additional Install : Complete
Node Install : In Progress
This phase can be restarted by running: playbooks/byo/openshift-node/config.yml
Failure summary:
1. Hosts: 192.168.99.50
Play: Configure nodes
Task: Install sdn-ovs package
Message: Error: Package: origin-sdn-ovs-3.7.0-1.0.7ed6862.x86_64 (centos-openshift-origin37)
Requires: origin-node = 3.7.0-1.0.7ed6862
Installed: origin-node-3.7.1-1.el7.git.0.0a2d6a1.x86_64 (@centos-openshift-origin37)
origin-node = 3.7.1-1.el7.git.0.0a2d6a1
Available: origin-node-3.7.0-1.0.7ed6862.x86_64 (centos-openshift-origin37)
origin-node = 3.7.0-1.0.7ed6862
[OSEv3:children]
masters
nodes
etcd
[OSEv3:vars]
ansible_user=root
public_ip_address = 192.168.99.50
host_key_checking = False
containerized = false
openshift_release=v3.7
openshift_pkg_version=-3.7.0
openshift_deployment_type=origin
openshift_hostname=192.168.99.50
openshift_master_cluster_public_hostname=192.168.99.50
openshift_master_default_subdomain=192.168.99.50.nip.io
openshift_master_unsupported_embedded_etcd=true
openshift_disable_check = docker_storage,memory_availability,disk_availability,docker_image_availability,package_version
openshift_enable_service_catalog=false
ansible_python_interpreter=/usr/bin/python
[masters]
192.168.99.50 openshift_public_hostname=192.168.99.50 openshift_ip=192.168.99.50
[etcd]
192.168.99.50 openshift_ip=192.168.99.50
[nodes]
192.168.99.50 openshift_node_labels="{'region':'infra','zone':'default'}" openshift_public_hostname=192.168.99.50 openshift_schedulable=true openshift_ip=192.168.99.50
It should be noted that I also tried configuring the playbook to use 3.7.1 RPMs by setting:
openshift_release=v3.7.1
openshift_pkg_version=-3.7.1
openshift_image_tag=v3.7.1
Unfortunately, in that case I hit a different problem that occurred even earlier in the installation process.
The specific error was:
TASK [openshift_master_facts : Set Default scheduler predicates and priorities] ************************************************************
fatal: [192.168.99.50]: FAILED! => {"msg": "An unhandled exception occurred while running the lookup plugin 'openshift_master_facts_default_predicates'. Error was a <class 'ansible.errors.AnsibleError'>, original message: Unknown short_version 3.10"}
Please let me know if I can provide any further information.
Thank you
I also hit exactly the same problem today.
The key points were:
I needed to add this property to get things moving:
openshift_pkg_version=-3.7.1
Once that was set, I hit the Unknown short_version 3.10 error.
Seems something got badly broken.
This was using release-3.7 branch of openshift-ansible.
2 remarks:
containerized and non-containerized environments: how RPM packages are resolved and, if not yet there, how they are downloaded (mirror server for CentOS, Fedora, RHEL)
Unknown short_version 3.10 error.
Here is the detail
TASK [openshift_master_facts : Set Default scheduler predicates and priorities] **********************************************************************************************************************************************************************
task path: /Users/dabou/Code/rhoar/cloud-native/infra/ansible/openshift-ansible/roles/openshift_master_facts/tasks/main.yml:110
fatal: [192.168.99.50]: FAILED! => {
"msg": "An unhandled exception occurred while running the lookup plugin 'openshift_master_facts_default_predicates'. Error was a <class 'ansible.errors.AnsibleError'>, original message: Unknown short_version 3.10"
}
PLAY RECAP *******************************************************************************************************************************************************************************************************************************************
192.168.99.50 : ok=108 changed=5 unreachable=0 failed=1
localhost : ok=11 changed=0 unreachable=0 failed=0
INSTALLER STATUS *************************************************************************************************************************************************************************************************************************************
Initialization : Complete
Health Check : Complete
etcd Install : Complete
Master Install : In Progress
This phase can be restarted by running: playbooks/byo/openshift-master/config.yml
Failure summary:
1. Hosts: 192.168.99.50
Play: Create OpenShift certificates for master hosts
Task: Set Default scheduler predicates and priorities
Message: An unhandled exception occurred while running the lookup plugin 'openshift_master_facts_default_predicates'. Error was a <class 'ansible.errors.AnsibleError'>, original message: Unknown short_version 3.10
Full debugging output would be helpful. The role openshift_version is responsible for setting openshift_version among other version-related variables. openshift.common.short_version is set by openshift_facts.
Also, can you post the value of openshift.common.short_version from the file /etc/ansible/facts.d/openshift.fact here?
I see this in the facts for the master node:
"short_version": "3.10"
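To illustrate why a short_version of "3.10" stops the run, here is a minimal sketch of a lookup keyed on short_version. It is not the code of the openshift_master_facts_default_predicates plugin: the table name, the exception class and the placeholder values are all hypothetical, and only the error message text is taken from the log above.

```python
class UnknownShortVersion(Exception):
    """Hypothetical stand-in for the AnsibleError seen in the log."""


# Hypothetical table: the real plugin's data and structure may differ.
DEFAULTS_BY_SHORT_VERSION = {
    "3.6": "placeholder defaults for 3.6",
    "3.7": "placeholder defaults for 3.7",
}


def default_predicates(short_version):
    """Return the defaults for a known short_version, or fail loudly."""
    try:
        return DEFAULTS_BY_SHORT_VERSION[short_version]
    except KeyError:
        raise UnknownShortVersion("Unknown short_version %s" % short_version)


# A release-3.7 checkout has no entry for 3.10, so the lookup fails.
try:
    default_predicates("3.10")
except UnknownShortVersion as error:
    print(error)
```

Under this assumption, the failure is a symptom: the installer code is fine for 3.7, but the fact it is handed says 3.10.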
@michaelgugino You can see the full output here
@michaelgugino The 3.10 version only shows up when using 3.7.1. When using 3.7.0, the error is:
Message: Error: Package: origin-sdn-ovs-3.7.0-1.0.7ed6862.x86_64 (centos-openshift-origin37)
Requires: origin-node = 3.7.0-1.0.7ed6862
Installed: origin-node-3.7.1-1.el7.git.0.0a2d6a1.x86_64 (@centos-openshift-origin37)
origin-node = 3.7.1-1.el7.git.0.0a2d6a1
Available: origin-node-3.7.0-1.0.7ed6862.x86_64 (centos-openshift-origin37)
origin-node = 3.7.0-1.0.7ed6862
The output from /etc/ansible/facts.d/openshift.fact is
{
"node": {
"schedulable": "true",
"labels": {
"region": "infra",
"zone": "default"
},
"dns_ip": "10.0.3.15",
"proxy_mode": "iptables",
"kubelet_args": {
"pods-per-core": [
"20"
]
}
},
"builddefaults": {
"config": {
"BuildDefaults": {
"configuration": {
"apiVersion": "v1",
"kind": "BuildDefaultsConfig",
"env": [
{
"name": "HTTP_PROXY",
"value": ""
},
{
"name": "HTTPS_PROXY",
"value": ""
},
{
"name": "NO_PROXY",
"value": ""
},
{
"name": "http_proxy",
"value": ""
},
{
"name": "https_proxy",
"value": ""
},
{
"name": "no_proxy",
"value": ""
}
],
"resources": {
"requests": {},
"limits": {}
}
}
}
}
},
"logging": {
"elasticsearch": {
"pvc": {},
"ops": {
"pvc": {}
}
}
},
"cloudprovider": {},
"master": {
"admission_plugin_config": {
"openshift.io/ImagePolicy": {
"configuration": {
"kind": "ImagePolicyConfig",
"executionRules": [
{
"skipOnResolutionFailure": true,
"matchImageAnnotations": [
{
"key": "images.openshift.io/deny-execution",
"value": "true"
}
],
"reject": true,
"name": "execution-denied",
"onResources": [
{
"resource": "pods"
},
{
"resource": "builds"
}
]
}
],
"apiVersion": "v1"
}
}
},
"named_certificates": [],
"cluster_public_hostname": "192.168.99.50",
"identity_providers": [
{
"name": "htpasswd_auth",
"login": "true",
"challenge": "true",
"kind": "HTPasswdPasswordIdentityProvider",
"filename": "/etc/origin/master/htpasswd"
}
],
"etcd_hosts": [
"192.168.99.50"
],
"manage_htpasswd": true,
"session_secrets_file": "/etc/origin/master/session-secrets.yaml",
"master_count": "1",
"cluster_method": "native",
"etcd_port": "2379",
"session_encryption_secrets": [
"EDOyxF7Yn3THeN4Dl1agIv4iVb2blCs+"
],
"ha": false,
"htpasswd_users": {
"admin": "$apr1$DloeoaY3$nqbN9fQBkyXgbj58buqEM."
},
"session_auth_secrets": [
"EDOyxF7Yn3THeN4Dl1agIv4iVb2blCs+"
]
},
"common": {
"etcd_runtime": "host",
"is_etcd_system_container": false,
"ip": "192.168.99.50",
"hostname": "192.168.99.50",
"deployment_subtype": "basic",
"is_master_system_container": false,
"is_containerized": false,
"is_node_system_container": false,
"system_images_registry": "docker.io",
"generate_no_proxy_hosts": true,
"is_openvswitch_system_container": false,
"no_proxy_etcd_host_ips": "192.168.99.50",
"public_hostname": "192.168.99.50",
"deployment_type": "origin"
},
"etcd": {},
"docker": {
"hosted_registry_network": "172.30.0.0/16",
"use_crio": false,
"hosted_registry_insecure": false,
"use_system_container": false
},
"buildoverrides": {
"config": {
"BuildOverrides": {
"configuration": {
"kind": "BuildOverridesConfig",
"apiVersion": "v1"
}
}
}
}
}
From what I can see packages for v3.7.1-1 actually contain binaries for v3.10.0-alpha.0
@jfchevrette Thank you for the update.
Actually this issue isn't really about 3.7.1, but rather about 3.7.0.
The problems with 3.7.1 were mentioned just to give the complete picture:
I had a problem with 3.7.0 (which is what all the debugging output is from) -> I tried 3.7.1 to see if I can get around it -> No luck, blocked on both 3.7.0 and 3.7.1
Also unable to install. Can confirm the RPM has the wrong versioned binaries in it:
# rpm -q origin
origin-3.7.1-1.el7.git.0.0a2d6a1.x86_64
# origin version
origin v3.10.0-alpha.0+0a2d6a1-65
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16
There does not appear to be any simple workaround, as the yum repo does not contain any alternate builds of the 3.7.1 package:
# yum --showduplicates list origin
Installed Packages
origin.x86_64 3.7.1-1.el7.git.0.0a2d6a1 @centos-openshift-origin37
Available Packages
origin.x86_64 3.7.0-1.0.7ed6862 centos-openshift-origin37
origin.x86_64 3.7.1-1.el7.git.0.0a2d6a1 centos-openshift-origin37
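The mismatch shown above, where the RPM is named 3.7.1 but the binary reports v3.10.0-alpha.0, can be checked mechanically. The following is a minimal sketch that works on the captured text of `rpm -q origin` and `origin version`. The function names are my own and not part of openshift-ansible, and the parsing assumes the output formats shown in this thread.

```python
import re


def short_version(version_string):
    """Return 'MAJOR.MINOR' from e.g. 'v3.10.0-alpha.0+0a2d6a1-65'."""
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("no version found in %r" % version_string)
    return "%s.%s" % (match.group(1), match.group(2))


def rpm_version(rpm_q_line, name="origin"):
    """Strip the package name from a line printed by `rpm -q <name>`."""
    prefix = name + "-"
    if not rpm_q_line.startswith(prefix):
        raise ValueError("unexpected rpm -q output: %r" % rpm_q_line)
    return rpm_q_line[len(prefix):]


def binary_version(version_output, component="origin"):
    """Pick the version of one component from `origin version` output."""
    for line in version_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == component:
            return fields[1]
    raise ValueError("component %r not found" % component)


# Text copied from the commands shown above.
rpm_q_output = "origin-3.7.1-1.el7.git.0.0a2d6a1.x86_64"
origin_version_output = """origin v3.10.0-alpha.0+0a2d6a1-65
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16"""

packaged = short_version(rpm_version(rpm_q_output))
reported = short_version(binary_version(origin_version_output))
print("packaged:", packaged, "reported:", reported, "match:", packaged == reported)
```

For the output above, the packaged short version is 3.7 and the reported one is 3.10, so the check fails for the 3.7.1-1 build.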
I tried debugging this issue further and I found the following information that looks interesting to me (forgive me if it's a totally wrong conclusion, since I'm by no means a yum/rpm expert):
When I run:
repoquery --requires --resolve origin-node-3.7.0
I get the following output:
ethtool-2:4.8-1.el7.x86_64
origin-0:3.7.0-1.0.7ed6862.x86_64
bash-0:4.2.46-29.el7_4.x86_64
util-linux-0:2.23.2-43.el7_4.2.x86_64
docker-2:1.12.6-48.git0fdc778.el7.centos.x86_64
util-linux-0:2.23.2-43.el7_4.2.i686
conntrack-tools-0:1.4.4-3.el7_3.x86_64
tuned-profiles-origin-node-0:3.7.0-1.0.7ed6862.x86_64
systemd-0:219-42.el7_4.10.x86_64
nfs-utils-1:1.3.0-0.48.el7.x86_64
origin-node-0:3.7.1-1.el7.git.0.0a2d6a1.x86_64
device-mapper-persistent-data-0:0.7.0-0.1.rc6.el7.x86_64
socat-0:1.7.3.2-2.el7.x86_64
I am very surprised to see origin-node-0:3.7.1-1.el7.git.0.0a2d6a1.x86_64 in the output above and I am guessing that it's causing all the problems.
Can someone with more knowledge please take a look?
Thanks
I'm definitely having the same issue. I was able to build a new cluster from scratch on 14 March (5 days ago); the only change needed was adding openshift_disable_check=package_version because of a docker release. Today I tried to build another cluster and kept receiving an error saying that my requested version didn't match the latest detected version, 3.10.
Example:
You requested openshift_release 3.7, which is not matched by
the latest OpenShift RPM we detected as origin-3.10.0
on host master-1-openshift-test.isc.local.
We will only install the latest RPMs, so please ensure you are getting the release
you expect. You may need to adjust your Ansible inventory, modify the repositories
available on the host, or run the appropriate OpenShift upgrade playbook
This function is what was failing the deployment, because the output of openshift version was:
openshift v3.10.0-alpha.0+0a2d6a1-65
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16
I had already manually verified the origin version to be origin.x86_64 0:3.7.1-1.el7.git.0.0a2d6a1. I also verified that the openshift binary being run was provided by origin.x86_64 0:3.7.1-1.el7.git.0.0a2d6a1 as well, and that everything else that rpm provided was also part of the 3.10-alpha version (or at least tagged as such).
To beat a dead horse, it is definitely related to the RPM. Version 3.7.1 was released from origin on 16 January 2018 (proof), but the only available RPM for that specific build of 3.7 is from 05 March 2018 and can be seen here. It's just that it at least reports that it has 3.10-alpha binaries in it. Rebuild is probably the best path forward, but I don't know how often RPMs get built, and if they are part of a separate project.
There was a problem with the released CentOS 3.7.1 packages.
They have been rebuilt, and passed our initial tests. They should be in the released centos repositories in a few days.
Thank you for your patience with the dead horse.
Just wanted to add, I'm also impacted by this. I've not found a temporary workaround, but perhaps I'm not setting the correct parameters in my inventory file.
Does anyone know a (temporary) workaround?
I think at this point your only option is to build a custom repo with the 3.7.1-2 RPMs (which reportedly have fixed the issue) from either https://buildlogs.centos.org/centos/7/paas/x86_64/openshift-origin/ or http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/
You can just dump the RPMs into a local directory, use createrepo on that directory, and then add it to /etc/yum.repos.d.
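As a rough illustration of the last step, this sketch renders a .repo file for such a local directory. The repo id `origin-local`, the directory `/opt/origin-rpms` and the `gpgcheck=0` setting are assumptions chosen for the example, not values from this thread. You would still need to download the RPMs and run createrepo on the directory yourself.

```python
import configparser
import io


def local_repo_file(repo_id, directory):
    """Render the text of a yum .repo file pointing at a local directory."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser[repo_id] = {
        "name": "Local Origin RPMs",
        "baseurl": "file://" + directory,
        "enabled": "1",
        # Assumption: signature checking is turned off for this throwaway repo.
        "gpgcheck": "0",
    }
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Hypothetical repo id and path; adjust to wherever the RPMs were dumped.
print(local_repo_file("origin-local", "/opt/origin-rpms"))
```

The printed text could be saved as a file such as /etc/yum.repos.d/origin-local.repo (the file name is also an assumption).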
I tried to add a node to an OSO 3.7.0 cluster with openshift_pkg_version=-3.7.0 set, and I also encountered problems.
I did track down the cause. With the repository at http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin/:
yum deplist origin-node-3.7.0
Loaded plugins: langpacks, product-id
package: origin-node.x86_64 3.7.0-1.0.7ed6862
[...]
dependency: tuned-profiles-origin-node = 3.7.0-1.0.7ed6862
provider: tuned-profiles-origin-node.x86_64 3.7.0-1.0.7ed6862
provider: origin-node.x86_64 3.7.1-1.el7.git.0.0a2d6a1
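To spot this kind of mismatch without reading the whole listing, the deplist output can be scanned for providers whose version differs from the version the dependency pins. This is a minimal sketch that assumes the `dependency:` / `provider:` line format shown above; the function name is my own. It only reports the mismatch and does not explain why yum lists that provider.

```python
def mismatched_providers(deplist_output):
    """Return (dependency, provider, provider_version) for each provider
    whose version differs from the one pinned with '=' in the dependency."""
    mismatches = []
    dependency = None
    required = None
    for raw_line in deplist_output.splitlines():
        line = raw_line.strip()
        if line.startswith("dependency:"):
            dependency = line[len("dependency:"):].strip()
            parts = dependency.split(" = ")
            # Only dependencies pinned to an exact version are checked.
            required = parts[1].strip() if len(parts) == 2 else None
        elif line.startswith("provider:") and required is not None:
            fields = line[len("provider:"):].split()
            if len(fields) == 2 and fields[1] != required:
                mismatches.append((dependency, fields[0], fields[1]))
    return mismatches


# Text copied from the yum deplist output above.
sample = """package: origin-node.x86_64 3.7.0-1.0.7ed6862
  dependency: tuned-profiles-origin-node = 3.7.0-1.0.7ed6862
   provider: tuned-profiles-origin-node.x86_64 3.7.0-1.0.7ed6862
   provider: origin-node.x86_64 3.7.1-1.el7.git.0.0a2d6a1"""

for dependency, provider, version in mismatched_providers(sample):
    print("%s is also provided by %s %s" % (dependency, provider, version))
```

For the sample, the only entry flagged is origin-node.x86_64 at 3.7.1-1.el7.git.0.0a2d6a1, which is the package that ends up installed in the failure reported at the top of this issue.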
You can install the cluster successfully by running the following commands before starting the installation:
wget http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/origin-3.7.1-2.el7.x86_64.rpm
wget http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/origin-clients-3.7.1-2.el7.x86_64.rpm
wget http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/origin-master-3.7.1-2.el7.x86_64.rpm
wget http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/origin-node-3.7.1-2.el7.x86_64.rpm
wget http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/tuned-profiles-origin-node-3.7.1-2.el7.x86_64.rpm
wget http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/origin-sdn-ovs-3.7.1-2.el7.x86_64.rpm
wget http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/origin-service-catalog-3.7.1-2.el7.x86_64.rpm
wget http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/origin-template-service-broker-3.7.1-2.el7.x86_64.rpm
wget http://cbs.centos.org/kojifiles/packages/origin/3.7.1/2.el7/x86_64/origin-dockerregistry-3.7.1-2.el7.x86_64.rpm
yum install *.rpm
Grr. I've been hit by something related to this issue.
You requested openshift_release 3.7.1, which is not matched by
the latest OpenShift RPM we detected as origin-3.10.0
on host xxxxxx.
We will only install the latest RPMs, so please ensure you are getting the release
you expect. You may need to adjust your Ansible inventory, modify the repositories
available on the host, or run the appropriate OpenShift upgrade playbook.
So, does @adawolfs's temporary workaround work, or is it best to wait for the CentOS 3.7.1 packages to hit the released CentOS repositories? Other than repeatedly trying, is there a way to know when the RPMs are in the released CentOS repositories?
@nemonik you can just watch this site
Even if the CentOS, Fedora or RHEL repos resolve such dependency mismatches (RPMs downloaded for 3.7, 3.10, ...), these action items are required:
containerized and non-containerized environments: how RPM packages are resolved and, if not yet there, how they are downloaded (mirror server for CentOS, Fedora, RHEL)
openshift_version, openshift_package_version, openshift_release and openshift_image_tag: real examples are required!
The updated origin-3.7.1-2.el7 is now available in all the regular repositories. This problem should be fixed now.
hi @tdawson
Thanks for the update,
So I should just run Ansible from the master now?
You should be able to, yes.
Hi tdawson,
Thanks for the prompt response, much appreciated.
I am running the 3_7 Ansible playbook and getting the following error: "changed": false, "msg": "OCP rpm version 3.6.1 is different from OCP image version 3.6.0"
I think it's the same issue.
I am trying to upgrade from 3.6 to 3.7.
The dependency for origin-node-3.7.0 is still broken.
yum deplist origin-node-3.7.0 ❌
dependency: tuned-profiles-origin-node = 3.7.0-1.0.7ed6862
provider: tuned-profiles-origin-node.x86_64 3.7.0-1.0.7ed6862
provider: origin-node.x86_64 3.7.1-1.el7.git.0.0a2d6a1
yum deplist origin-node-3.7.1 🆗
dependency: tuned-profiles-origin-node = 3.7.1-2.el7
provider: tuned-profiles-origin-node.x86_64 3.7.1-2.el7
provider: origin-node.x86_64 3.7.1-1.el7.git.0.0a2d6a1
I'm seeing this error when upgrading from 3.6 to 3.7, with 3.7 set in the inventory.
Message: Error: Package: origin-node-3.7.1-2.el7.x86_64 (centos-openshift-origin37)
Requires: tuned-profiles-origin-node = 3.7.1-2.el7
Available: origin-node-3.7.1-1.el7.git.0.0a2d6a1.x86_64 (centos-openshift-origin37)
tuned-profiles-origin-node
Available: origin-node-3.7.0-1.0.7ed6862.x86_64 (centos-openshift-origin37)
Not found
Installing: origin-node-3.7.1-2.el7.x86_64 (centos-openshift-origin37)
Not found
I've not been able to get the 3.6 to 3.7 upgrade to run with this config. I've set everything as requested, but the repos still don't show all the right versions. Is there a documented process to get this fixed? I've done a git checkout against a tag and a release branch that I know worked until recently.
I still get one of several errors: either the 3.10 error, a docker package version error, or "tuned-profiles-origin-node (3.7.2) is not available".
I've tried the workaround (installing all the 3.7.2 packages from cbs.centos) but have still had issues.
Please, I need a stable version that is known to work. Does anyone have one for CentOS 7?
Yes, this problem was fixed a couple of weeks ago and installs on Centos7 now work fine.
What we do is just specify this option in the ansible inventory file:
openshift_release=v3.7
No need to specify the openshift_image_tag or openshift_pkg_version properties.
And be on the release-3.7 branch of the openshift-ansible github repo.
I found that specifying:
openshift_pkg_version=-3.7.1-2.el7
Yields a working deployment on CentOS 7
I had an issue running the scale-up playbook when adding 3 nodes: it installed origin-node-3.7.1-1.el7.git.0.0a2d6a1.x86_64.rpm and then failed when trying to install origin-sdn-ovs, because it was trying 3.7.0 and reported that the package above was already installed. This happened even though the 3.7 repo was enabled and the 3.7.1-2 packages were there too. To work around it, I ran the scale-up to the point where it failed, ran yum downgrade on the new nodes to downgrade origin-node to 3.7.0, and then reran the scale-up.
Any chance the 'bad' packages with 'git' in their names (like 3.7.1-1.el7.git.0.0a2d6a1.x86_64.rpm) will be removed from the CentOS repos?
All, can you please try to re-run the deployment? New origin RPMs (v3.7.2) were promoted to the official CentOS repos yesterday, and we should no longer have any issues with mixed versions.
We apologize for the inconvenience. We had a few issues with our automation, which should now be fixed.
I am marking this issue closed because it looks like we've resolved this a few weeks ago.
If you continue to have problems, please re-open, or create a new issue.
Working on this today.
I found I had to set this:
openshift_release=v3.7
openshift_pkg_version=v3.7
openshift_image_tag=v3.7
To get past that particular problem...
-Andy
Actually I found out i was still on the master branch... whoopsie :)
Yes, this has been resolved.
I can confirm that if you make sure you are on the correct branch and only set this openshift_release=v3.7
It works perfectly