Origin: test_pull_request_origin_extended_conformance_install_update failing across the board

Created on 10 Nov 2017 · 25 comments · Source: openshift/origin

test_pull_request_origin_extended_conformance_install_update started failing across the board about 2 hours ago

https://openshift-gce-devel.appspot.com/builds/origin-ci-test/pr-logs/directory/test_pull_request_origin_extended_conformance_install_update

@simo5

component/internal-tools kind/bug priority/P1

All 25 comments

Fails here

Creating output directory: /tmp/tito
Tagging new version of openshift-ansible: 3.7.5-1 -> 3.7.6-1
Traceback (most recent call last):
  File "/usr/bin/tito", line 23, in <module>
    CLI().main(sys.argv[1:])
  File "/usr/lib/python2.7/site-packages/tito/cli.py", line 202, in main
    return module.main(argv)
  File "/usr/lib/python2.7/site-packages/tito/cli.py", line 666, in main
    return tagger.run(self.options)
  File "/usr/lib/python2.7/site-packages/tito/tagger/main.py", line 114, in run
    self._tag_release()
  File "/usr/lib/python2.7/site-packages/tito/tagger/main.py", line 136, in _tag_release
    self._check_tag_does_not_exist(self._get_new_tag(new_version))
  File "/usr/lib/python2.7/site-packages/tito/tagger/main.py", line 501, in _check_tag_does_not_exist
    raise Exception("Tag %s already exists!" % new_tag)
Exception: Tag openshift-ansible-3.7.6-1 already exists!
++ export status=FAILURE
++ status=FAILURE
+ set +o xtrace
########## FINISHED STAGE: FAILURE: BUILD AN OPENSHIFT-ANSIBLE RELEASE [00h 00m 02s] ##########

@sdodson @michaelgugino @abutcher

@smunilla Linking you to this issue.

Hopefully fixed now. Branching 3.7 in openshift-ansible yesterday caused this. The normal CD build of 3.7 in the release-3.7 branch created the 3.7.6-1 tag. The openshift-ansible.spec in master was still at 3.7.4-1. I've bumped the master openshift-ansible.spec to 3.8 to avoid this.
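The collision can be reproduced in miniature. The repo below is a throwaway stand-in, not the real openshift-ansible checkout:

```shell
# Hypothetical reproduction of the failure mode: tito aborts when the tag it
# is about to create already exists, which is exactly what happens once the
# release-3.7 CD build has already pushed openshift-ansible-3.7.6-1.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.name=ci -c user.email=ci@example.com \
    commit -q --allow-empty -m 'init'
git -C "$repo" tag openshift-ansible-3.7.6-1   # created by the release-3.7 CD build
tag=openshift-ansible-3.7.6-1                  # what master (spec at 3.7.5-1) bumps to
if git -C "$repo" rev-parse -q --verify "refs/tags/$tag" >/dev/null; then
    echo "Tag $tag already exists!"            # the same condition tito raises on
fi
```

Bumping the master spec to 3.8 sidesteps this because the two branches no longer compute overlapping tag names.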

Install update is still broken; it might be that the openshift-ansible tag has the wrong release.

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/17268/test_pull_request_origin_extended_conformance_install_update/9241/


Still broken on the next step:

######### STARTING STAGE: INSTALL ANSIBLE ATOMIC-OPENSHIFT-UTILS ##########
+ [[ -s /var/lib/jenkins/jobs/test_pull_request_origin_extended_conformance_install_update/workspace/activate ]]
+ source /var/lib/jenkins/jobs/test_pull_request_origin_extended_conformance_install_update/workspace/activate
++ export VIRTUAL_ENV=/var/lib/jenkins/origin-ci-tool/b4433dfdce6a5fba26d100d9416d78fd95716382
++ VIRTUAL_ENV=/var/lib/jenkins/origin-ci-tool/b4433dfdce6a5fba26d100d9416d78fd95716382
++ export PATH=/var/lib/jenkins/origin-ci-tool/b4433dfdce6a5fba26d100d9416d78fd95716382/bin:/sbin:/usr/sbin:/bin:/usr/bin
++ PATH=/var/lib/jenkins/origin-ci-tool/b4433dfdce6a5fba26d100d9416d78fd95716382/bin:/sbin:/usr/sbin:/bin:/usr/bin
++ unset PYTHON_HOME
++ export OCT_CONFIG_HOME=/var/lib/jenkins/jobs/test_pull_request_origin_extended_conformance_install_update/workspace/.config
++ OCT_CONFIG_HOME=/var/lib/jenkins/jobs/test_pull_request_origin_extended_conformance_install_update/workspace/.config
++ mktemp
+ script=/tmp/tmp.KroIrSOmRy
+ cat
+ chmod +x /tmp/tmp.KroIrSOmRy
+ scp -F ./.config/origin-ci-tool/inventory/.ssh_config /tmp/tmp.KroIrSOmRy openshiftdevel:/tmp/tmp.KroIrSOmRy
+ ssh -F ./.config/origin-ci-tool/inventory/.ssh_config -t openshiftdevel 'bash -l -c "timeout 600 /tmp/tmp.KroIrSOmRy"'
+ cd /data/src/github.com/openshift/aos-cd-jobs
+ pkg_name=origin
+ echo origin
+ echo 'openshift-ansible openshift-ansible-callback-plugins openshift-ansible-docs openshift-ansible-filter-plugins openshift-ansible-lookup-plugins openshift-ansible-playbooks openshift-ansible-roles'
++ cat OPENSHIFT_ANSIBLE_BUILT_VERSION
+ sudo python sjb/hack/determine_install_upgrade_version.py atomic-openshift-utils-3.8.1-1.git.0.d2eee04.el7.noarch --dependency_branch master
[ERROR] Can not determine install and upgrade version for the `atomic-openshift-utils` package
++ export status=FAILURE
++ status=FAILURE
+ set +o xtrace
########## FINISHED STAGE: FAILURE: INSTALL ANSIBLE ATOMIC-OPENSHIFT-UTILS [00h 01m 01s] ##########
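The script's source isn't shown in this thread, but a hedged sketch of the logic it needs is: derive the "upgrade" minor from the built package NVR and expect the previous minor to be installable from the enabled repos.

```shell
# Hedged sketch (not the real determine_install_upgrade_version.py): from the
# built NVR the job derives the "upgrade" minor (3.8) and the "install"
# minor (3.7); the install minor must exist in the enabled yum repos.
nvr='atomic-openshift-utils-3.8.1-1.git.0.d2eee04.el7.noarch'
ver="${nvr#atomic-openshift-utils-}"       # 3.8.1-1.git.0.d2eee04.el7.noarch
major="${ver%%.*}"                         # 3
minor="$(echo "$ver" | cut -d. -f2)"       # 8
echo "upgrade: ${major}.${minor}, install: ${major}.$((minor - 1))"
```

With the repos listed below, no 3.7 atomic-openshift-utils is available, so the "install" half of that pair cannot be satisfied.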

Adding 3.7 repos to the test AMIs in https://github.com/openshift/origin-ci-tool/commit/2453a176a96088fff6a596fb061e8f11e54e43a4

Will need to build a new base AMI... which is currently blocked. Will figure out what we need to do there.

FYI: The issue with determine_install_upgrade_version.py is that we are building openshift-ansible with a 3.8 tag

....       
atomic-openshift-utils.noarch                 3.5.145-1.git.0.e1e330f.el7                                                  rhel-7-server-ose-3.5-rpms            
atomic-openshift-utils.noarch                 3.6.8-1.git.0.8e26f8c.el7                                                    centos-paas-sig-openshift-origin36-rpms
atomic-openshift-utils.noarch                 3.6.153-1.el7                                                                centos-paas-sig-openshift-origin36-rpms
atomic-openshift-utils.noarch                 3.6.173.0.3-1.el7                                                            centos-paas-sig-openshift-origin36-rpms
atomic-openshift-utils.noarch                 3.6.173.0.60-1.el7                                                           centos-paas-sig-openshift-origin36-rpms
atomic-openshift-utils.noarch                 3.6.173.0.71-1.git.0.23eee8c.el7                                             rhel-7-server-ose-3.6-rpms            
atomic-openshift-utils.noarch                 3.8.1-1.git.0.0ef0a1e.el7                                                    openshift-ansible-local-release 

so we are missing the 3.7 version that would be installed. That's why the script fails

Right -- adding the repos for 3.7 in the commit above adds the 3.7 version that we need to be present


Is there any documentation about places that need to change once we cut a new branch?

@kargakis unfortunately no. We had a card on our board to tackle it, but there are a lot of moving parts

That particular failure should be fixed now.

https://ci.openshift.redhat.com/jenkins/job/test_pull_request_openshift_ansible_extended_conformance_install_update/3467 at least got past installing the 3.7 installer as part of master install_upgrade

It appears we get farther now. New fail point is

INSTALLER STATUS ***************************************************************
Initialization             : Complete
Health Check               : In Progress
    This phase can be restarted by running: playbooks/byo/openshift-checks/pre-install.yml



Failure summary:


  1. Hosts:    localhost
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               One or more required container images are not available:
                   openshift/origin-deployer:v3.7.0,
                   openshift/origin-docker-registry:v3.7.0,
                   openshift/origin-haproxy-router:v3.7.0,
                   openshift/origin-pod:v3.7.0
               Checked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>
               Default registries searched: docker.io


The execution of "/usr/share/ansible/openshift-ansible/playbooks/byo/config.yml" includes checks designed to fail early if the requirements of the playbook are not met. One or more of these checks failed. To disregard these results, explicitly disable checks by setting an Ansible variable:
   openshift_disable_check=docker_image_availability
Failing check names are shown in the failure details above. Some checks may be configurable by variables if your requirements are different from the defaults; consult check documentation.
Variables can be set in the inventory or passed on the command line using the -e flag to ansible-playbook.
++ export status=FAILURE
++ status=FAILURE
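The check's own output states which probe it ran. The loop below only prints the equivalent commands for the four missing images; drop the leading `echo` to actually run them (assumes skopeo is installed and the host has registry access):

```shell
# Prints the same probes the docker_image_availability check describes above;
# remove the leading echo to actually run them (requires skopeo + network).
for img in origin-deployer origin-docker-registry origin-haproxy-router origin-pod; do
    echo skopeo inspect --tls-verify=false "docker://docker.io/openshift/${img}:v3.7.0"
done
```

Until the v3.7.0 images are actually pushed to docker.io, every one of those probes will fail, which is why the check keeps tripping.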

@smarterclayton please confirm but my understanding is that this job will remain broken until we _actually_ cut a 3.7 release and push out the images, etc

That's my understanding of the situation as well.

Lowering priority as it is not a queue blocker

This has flaked three times in a row for https://github.com/openshift/origin/pull/16538, which updates just the /example directory.

@jsafrane the job is 100% broken but is not required for any merges. It's just running in the background.

It's also _not_ set up to run by default, so it's the /retest that is running it for your PR.

Oh, ok. I miss the tiny Required label in the test results window they have in Kubernetes :-)

Yeah, we considered it -- even k/k may get rid of it in the future :)

We will likely end up with requiring all statuses that are present on a commit.

This is not happening anymore.
