Origin: We're still vetting Docker RPMs ourselves

Created on 25 Jan 2018 · 9 Comments · Source: openshift/origin

Symptom:

  1. Hosts:    localhost
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               Some dependencies are required in order to check Docker image availability.
               Unable to install required packages on this host:
                   python-docker-py,
                   skopeo
               Error: Package: 1:skopeo-0.1.26-2.dev.git2e8377a.el7.x86_64 (oso-rhui-rhel-server-extras)
                          Requires: skopeo-containers = 1:0.1.26-2.dev.git2e8377a.el7
                          Installed: 1:skopeo-containers-0.1.27-3.dev.git14245f2.el7.x86_64 (@httpsmirroropenshiftcomenterpriserheldockertestedx8664os)
                              skopeo-containers = 1:0.1.27-3.dev.git14245f2.el7
                          Available: 1:skopeo-containers-0.1.17-0.7.git1f655f3.el7.x86_64 (oso-rhui-rhel-server-extras)
                              skopeo-containers = 1:0.1.17-0.7.git1f655f3.el7
                          Available: 1:skopeo-containers-0.1.17-1.el7.x86_64 (oso-rhui-rhel-server-extras)
                              skopeo-containers = 1:0.1.17-1.el7
                          Available: 1:skopeo-containers-0.1.18-1.el7.x86_64 (oso-rhui-rhel-server-extras)
                              skopeo-containers = 1:0.1.18-1.el7
                          Available: 1:skopeo-containers-0.1.19-1.el7.x86_64 (oso-rhui-rhel-server-extras)
                              skopeo-containers = 1:0.1.19-1.el7
                          Available: 1:skopeo-containers-0.1.20-2.el7.x86_64 (oso-rhui-rhel-server-extras)
                              skopeo-containers = 1:0.1.20-2.el7
                          Available: 1:skopeo-containers-0.1.20-2.1.gite802625.el7.x86_64 (oso-rhui-rhel-server-extras)
                              skopeo-containers = 1:0.1.20-2.1.gite802625.el7
                          Available: 1:skopeo-containers-0.1.23-1.git1bbd87f.el7.x86_64 (oso-rhui-rhel-server-extras)
                              skopeo-containers = 1:0.1.23-1.git1bbd87f.el7
                          Available: 1:skopeo-containers-0.1.24-1.dev.git28d4e08.el7.x86_64 (oso-rhui-rhel-server-extras)
                              skopeo-containers = 1:0.1.24-1.dev.git28d4e08.el7
                          Available: 1:skopeo-containers-0.1.26-2.dev.git2e8377a.el7.x86_64 (oso-rhui-rhel-server-extras)
                              skopeo-containers = 1:0.1.26-2.dev.git2e8377a.el7

The repo has skopeo-0.1.27-3.dev.git14245f2.el7.x86_64.rpm, FWIW; no clue how an older skopeo was installed.
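The root cause in the error above is that skopeo requires a skopeo-containers with the exact same version-release, and the installed skopeo-containers (0.1.27-3...) is newer than any skopeo the enabled repos offer. A minimal pre-flight check for that mismatch can be sketched in plain shell by comparing the version-release portion of the two NVRA strings (the `vr_of` helper is illustrative, not part of any existing tooling):

```shell
#!/bin/sh
# Sketch: extract the version-release ("VR") of an RPM NVRA string so two
# packages can be checked for an exact match before attempting an install.
vr_of() {
    nvra=${1%.*}       # drop the trailing ".x86_64" arch suffix
    rel=${nvra##*-}    # release is everything after the last dash
    nv=${nvra%-*}      # name-version remains
    ver=${nv##*-}      # version is the next dash-separated field
    printf '%s-%s\n' "$ver" "$rel"
}

# The two NVRAs from the yum error in this issue:
skopeo="1:skopeo-0.1.26-2.dev.git2e8377a.el7.x86_64"
containers="1:skopeo-containers-0.1.27-3.dev.git14245f2.el7.x86_64"

if [ "$(vr_of "$skopeo")" != "$(vr_of "$containers")" ]; then
    echo "mismatch: skopeo $(vr_of "$skopeo") vs skopeo-containers $(vr_of "$containers")"
fi
```

Run against the versions from this report, the check flags the 0.1.26 vs 0.1.27 mismatch that yum only surfaces mid-install.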

area/tests lifecycle/rotten priority/P0


All 9 comments

Should be fixed in https://github.com/openshift/aos-cd-jobs/commit/38e99d4269f7d9201b8e0eb0b7076c088c430d7d ?

Basic idea is:

  • we gather the full list of RPMs needed to install bleeding-edge Docker
  • when we validate bleeding-edge Docker, we publish those RPMs in the dockertested repo
  • when we install Docker, we turn that repo on and then off again
  • in this case, installing Docker pulled in skopeo-containers but not skopeo
  • when openshift-ansible went to install skopeo, it couldn't reach the now-disabled dockertested repo
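The failure mode in the last two bullets comes from toggling the repo's persistent state. One way to avoid it is to install everything the checks will later need in a single transaction while the repo is enabled only for that transaction, via yum's `--enablerepo` option. A sketch, assuming the repo id is `dockertested` (the helper and package list are illustrative):

```shell
#!/bin/sh
# Sketch: build a yum invocation that enables the dockertested repo for
# one transaction only, so the repo's on-disk config stays disabled and
# nothing later depends on its state.
build_install_cmd() {
    repo=$1; shift
    # --enablerepo is scoped to this transaction; it does not edit
    # the repo file under /etc/yum.repos.d.
    printf 'yum --enablerepo=%s -y install %s\n' "$repo" "$*"
}

build_install_cmd dockertested docker skopeo skopeo-containers python-docker-py
```

Installing skopeo, skopeo-containers, and python-docker-py together in that one transaction would have let the resolver pick a matching skopeo/skopeo-containers pair from the same repo.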

The whole approach to this needs to be 100000% rethought. As an aside, this whole process was a crutch until the Docker team could run Origin e2es as a test before they push out new RPMs. @jtligon did that happen?

Nope, it did not.

/assign @jtligon
/unassign

We need to get rid of our Docker testing jobs and just pull from RHEL 7 Next. We have delivered all the bits that are necessary to get Origin conformance to be simple to run.

/unassign @jtligon
/assign @runcom

Basic idea is:

  • we gather the full list of RPMs needed to install bleeding-edge Docker
  • when we validate bleeding-edge Docker, we publish those RPMs in the dockertested repo
  • when we install Docker, we turn that repo on and then off again
  • in this case, installing Docker pulled in skopeo-containers but not skopeo
  • when openshift-ansible went to install skopeo, it couldn't reach the now-disabled dockertested repo

we can definitely rework this as:

  • we submit a change in projectatomic/docker
  • we have a job that builds an RPM for docker and its dependencies (??? @lsm5, on CentOS probably)
  • we validate the RPM above by running the whole origin tests from something like https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/test_branch_origin_extended_conformance_gce/2350/artifacts/rpms
  • regardless of flaky tests, if we get to a point where we can _at least_ install origin correctly, we publish the RPM in the dockertested repo (??? @lsm5 right now we tag in brew, I guess, so we need to come up with a solution for this)
  • at this point Origin will kick in and at least we should have a pretty stable docker

I'm not sure we can actually do better than that. It's a chicken-and-egg situation: we test docker and it works for us with origin, but when we hand the RPM out to you guys it breaks. I mean, with what I've proposed above, we should at least be able to catch issues like the one in this report. Any other docker issues will actually be caught by the origin CI.
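The publish gate in the proposal above reduces to a simple rule: promote the candidate RPMs to the dockertested repo only if the validation step (origin install plus conformance) exits 0. A sketch of that gate, where the function name and echo messages are illustrative placeholders rather than real tooling:

```shell
#!/bin/sh
# Sketch of the publish gate: run the validation command passed as
# arguments; publish the candidate RPMs only if it exits 0.
gate_and_publish() {
    if "$@"; then
        echo "publish candidate RPMs to dockertested"
    else
        echo "reject candidate RPMs"
    fi
}

gate_and_publish true    # stands in for a passing origin install/conformance run
gate_and_publish false   # stands in for a failed validation run
```

In a real job, the `true`/`false` placeholders would be the actual test invocation, and the publish branch would do the repo push (or brew tag) instead of echoing.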

The container team is going to own the jobs needed so we can offload Steve from this.

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close
