When building the openshift/node docker image:
--- openshift/node ---
Sending build context to Docker daemon 18.94 kB
Sending build context to Docker daemon 18.94 kB
Step 1 : FROM openshift/origin
---> 6e39b07a70f9
Step 2 : MAINTAINER Devan Goodwin <[email protected]>
---> Running in 73d875dde0c7
---> 4837e7ea6aff
Removing intermediate container 73d875dde0c7
Step 3 : ADD https://copr.fedoraproject.org/coprs/maxamillion/origin-next/repo/epel-7/maxamillion-origin-next-epel-7.repo /etc/yum.repos.d/
---> 14cab38052e9
Removing intermediate container 9eb5cda1db50
Step 4 : RUN INSTALL_PKGS="libmnl libnetfilter_conntrack openvswitch libnfnetlink iptables iproute bridge-utils procps-ng ethtool socat openssl binutils xz kmod-libs kmod sysvinit-tools device-mapper-libs dbus ceph-common iscsi-initiator-utils" && yum install -y $INSTALL_PKGS && rpm -V $INSTALL_PKGS && yum clean all
---> Running in 5dfb4fa924e4
Loaded plugins: fastestmirror, ovl
http://mirror.cc.columbia.edu/pub/linux/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
...
...
One of the configured repositories failed (Extra Packages for Enterprise Linux 7 - x86_64),
and yum doesn't have enough cached data to continue. At this point the only
safe thing yum can do is fail. There are a few ways to work "fix" this:
1. Contact the upstream for the repository and get them to fix the problem.
2. Reconfigure the baseurl/etc. for the repository, to point to a working
upstream. This is most often useful if you are using a newer
distribution release than is supported by the repository (and the
packages for the previous distribution release still work).
3. Disable the repository, so yum won't use it by default. Yum will then
just ignore the repository until you permanently enable it again or use
--enablerepo for temporary usage:
yum-config-manager --disable epel
4. Configure the failing repository to be skipped, if it is unavailable.
Note that yum will try to contact the repo. when it runs most commands,
so will have to try and fail each time (and thus. yum will be be much
slower). If it is a very temporary problem though, this is often a nice
compromise:
yum-config-manager --save --setopt=epel.skip_if_unavailable=true
failure: repodata/repomd.xml from epel: [Errno 256] No more mirrors to try.
https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin_integration/208/console
This has gotten worse ...
http://s3-mirror-us-east-1.fedoraproject.org/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.es.its.nyu.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://download-i2.fedoraproject.org/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.cc.columbia.edu/pub/linux/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.cs.pitt.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
https://lug.mtu.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.oss.ou.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://ftp.osuosl.org/pub/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://ftp.linux.ncsu.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://linux.mirrors.es.net/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
https://mirrors.cat.pdx.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://kdeforge2.unl.edu/mirrors/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.cs.princeton.edu/pub/mirrors/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.unl.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirrors.rit.edu/fedora/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
https://mirror.umd.edu/fedora/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://archive.linux.duke.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.math.princeton.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.sfo12.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://fedora-epel.mirror.lstn.net/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
https://mirrors.kernel.org/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.datto.com/fedora/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
https://mirrors.xmission.com/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.cogentco.com/pub/linux/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirrors.syringanetworks.net/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.nexcess.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.metrocast.net/fedora/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://reflector.westga.edu/repos/Fedora-EPEL/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.symnds.com/distributions/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirrors.tummy.com/pub/fedora.redhat.com/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://epel.wallawalla.edu/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://mirror.prgmr.com/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
https://dl.fedoraproject.org/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
https://muug.ca/mirror/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
http://fedora.westmancom.com/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
https://mirror.cpsc.ucalgary.ca/mirror/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
One of the configured repositories failed (Extra Packages for Enterprise Linux 7 - x86_64),
and yum doesn't have enough cached data to continue. At this point the only
safe thing yum can do is fail. There are a few ways to work "fix" this:
1. Contact the upstream for the repository and get them to fix the problem.
2. Reconfigure the baseurl/etc. for the repository, to point to a working
upstream. This is most often useful if you are using a newer
distribution release than is supported by the repository (and the
packages for the previous distribution release still work).
3. Disable the repository, so yum won't use it by default. Yum will then
just ignore the repository until you permanently enable it again or use
--enablerepo for temporary usage:
yum-config-manager --disable epel
4. Configure the failing repository to be skipped, if it is unavailable.
Note that yum will try to contact the repo. when it runs most commands,
so will have to try and fail each time (and thus. yum will be be much
slower). If it is a very temporary problem though, this is often a nice
compromise:
yum-config-manager --save --setopt=epel.skip_if_unavailable=true
failure: repodata/repomd.xml from epel: [Errno 256] No more mirrors to try.
http://s3-mirror-us-east-1.fedoraproject.org/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.es.its.nyu.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://download-i2.fedoraproject.org/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.cc.columbia.edu/pub/linux/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.cs.pitt.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
https://lug.mtu.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.oss.ou.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://ftp.osuosl.org/pub/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://ftp.linux.ncsu.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://linux.mirrors.es.net/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
https://mirrors.cat.pdx.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://kdeforge2.unl.edu/mirrors/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.cs.princeton.edu/pub/mirrors/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.unl.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirrors.rit.edu/fedora/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
https://mirror.umd.edu/fedora/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://archive.linux.duke.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.math.princeton.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.sfo12.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://fedora-epel.mirror.lstn.net/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
https://mirrors.kernel.org/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.datto.com/fedora/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
https://mirrors.xmission.com/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.cogentco.com/pub/linux/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirrors.syringanetworks.net/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.nexcess.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.metrocast.net/fedora/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://reflector.westga.edu/repos/Fedora-EPEL/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.symnds.com/distributions/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirrors.tummy.com/pub/fedora.redhat.com/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://epel.wallawalla.edu/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://mirror.prgmr.com/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
https://dl.fedoraproject.org/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
https://muug.ca/mirror/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
http://fedora.westmancom.com/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
https://mirror.cpsc.ucalgary.ca/mirror/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
@tdawson did you say this was just EPEL mirrors being out of sync with the source? Can we mitigate failures like this in CI?
Upgrading this since we're seeing quite a lot of it.
@danmcp @tdawson is there any possible way for us to latch onto a stable EPEL mirror from RH so our CI doesn't suffer needlessly? Can we cache the RPMs we're actually using somewhere?
Mirrors are certainly possible. @tdawson Do you have any particular recommendations or advice on the best approach?
@stevekuznetsov Can you try adding a yum clean all prior to attempting to use the repos?
It has nothing to do with the yum caches - due to the way the repo is managed updates will _always_ be racy. It would be a lot less racy if 2 copies of the RPM data were cached and the RPMs were sync'd before the repodata.
I'd say use OpenShift's mirrors: they have EPEL mirrored and are only updated once a day, so you won't get a weird update halfway through your build.
Instead of installing epel-release, do the following:
RUN wget -O /etc/yum.repos.d/epel7.repo https://mirror.openshift.com/mirror/epel/epel7.repo
We can address this by switching to the mirror.openshift.com EPEL mirror for AMI builds, etc., but we will need oc ex dockerbuild to be able to inject secrets (sslclientcert/key) for the mirror into our Docker builds, where we see the majority of these flakes.
This will be blocked on @smarterclayton implementing that secrets feature.
You do not need the certs/key for that area https://mirror.openshift.com/mirror/epel/
Since there is nothing private in that repo, you do not need a certificate to access it.
If you are getting failures when you are accessing them, please let me know.
@tdawson ah, incredible. I assumed I would need them due to some other repos on the domain needing them.
@smarterclayton since our internal mirror doesn't need the secrets, can you see anything wrong with simply disabling the upstream EPEL repo in the RHEL base image that I think we use for all of our builds?
Are these official images?
@smarterclayton I'm not sure exactly how the official images are built. What I'm concerned about is the images built during a Jenkins test run. Probably the release images are built off of the same AMI, though. We could then add a flag or something to vagrant-openshift to make this configurable? @danmcp WDYT?
@stevekuznetsov An option sounds reasonable.
Ok, so I'm going to fix this by adding a symlink from /var/run/secrets/overrides.repo (or similar) into /etc/yum.repos.d/overrides.repo, and then we'll use the RHEL secrets patch to put overrides.repo into /usr/share/rhel/secrets/overrides.repo. That'll ensure we use the mirror only on our machines.
Thoughts?
Change has been delivered. I looked at the install_rhel7 and install jobs in vagrant openshift - both of them do various things to enable other mirrors. We want epel7 mirror for both RHEL and CentOS, correct, but not Fedora? I would expect us to then set the env var OS_BUILD_IMAGE_ARGS="--mount epel7mirror_path:/etc/yum.repos.d/ci-overrides.repo" for any build running on those base images (it needs to be during the base image runs and during the general CI jobs)
> We want epel7 mirror for both RHEL and CentOS, correct, but not Fedora?
Yes, that's correct.
@stevekuznetsov @tdawson the yum is flaking pretty heavily today, is there something we can do with this?
We should make sure we're using the EPEL mount that Clayton developed in all of our builds to only allow our mirror.openshift repo to act as an EPEL source. To be honest I can't quite remember if I had the bandwidth to do that before.
@stevekuznetsov can you elaborate on what that means? What EPEL mount, where do we mount it, and how? it's not just the origin builds that are hitting this, so if there's a mechanism to fix it, i'd like to update the other builds my team owns.
(for that matter is it something that should be baked into our AMIs?)
the merge/test is completely broken now, so bumping prio.
> @stevekuznetsov can you elaborate on what that means? What EPEL mount, where do we mount it, and how? it's not just the origin builds that are hitting this, so if there's a mechanism to fix it, i'd like to update the other builds my team owns.
The issue here is as such:
In our AMIs and in all of the containers we build in our CI, we have EPEL present. On CentOS and RHEL, we install it since it is not present by default. When EPEL rolls out a release/update, it takes a while for the hundreds of EPEL mirrors to coalesce around the new versions. When yum or dnf get separate and unique output for repolists, etc, from different mirrors for the same repo, they are unhappy.
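The failure mode described above can be illustrated with a small sketch. It assumes (as an understanding of yum's behaviour, not something confirmed in this thread) that the metalink advertises one or more accepted checksums for repomd.xml and that a mirror is rejected when its copy hashes to anything else, which would produce the "does not match metalink" errors in the logs. The function names and the idea of passing mirror contents in as a dictionary are hypothetical.

```python
import hashlib


def repomd_matches_metalink(repomd_bytes, metalink_sha256_hashes):
    """Return True if this copy of repomd.xml hashes to a value the metalink lists."""
    digest = hashlib.sha256(repomd_bytes).hexdigest()
    return digest in metalink_sha256_hashes


def usable_mirrors(mirror_repomd, metalink_sha256_hashes):
    """Filter a {mirror_url: repomd_bytes} mapping down to mirrors that would be accepted.

    A mirror that has not synced yet still serves the old repomd.xml,
    so its hash is absent from the metalink and it is filtered out.
    """
    return [
        url
        for url, body in mirror_repomd.items()
        if repomd_matches_metalink(body, metalink_sha256_hashes)
    ]
```

If every mirror a client tries is still serving stale metadata, the list comes back empty, which would correspond to the "No more mirrors to try" failure. Whether a particular run fails then depends on which mirrors it happens to reach.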
What we need to do is:
- [...] mirror.openshift
- [...] oc ex dockerbuilds, we need to allow for an envar to come in that sets OS_BUILD_IMAGE_ARGS="--mount epel7mirror_path:/etc/yum.repos.d/ci-overrides.repo"
The issue here is that whatever we set up as a yum repo override in the containers needs to be transient -- we do _not_ want the containers we ship to have the override present after the builds. I understand Clayton's earlier comment to mean he placed the correct mirror override at /etc/yum.repos.d/ci-overrides.repo in the AMIs. Potentially, we could add the OS_BUILD_IMAGE_ARGS value to /etc/environment on the AMIs so we get it by default always. I'm not sure if Clayton plumbed this value through to all of our uses of oc ex dockerbuild or not. I don't think we will be able to support this on normal docker builds, but we should be trying not to use that anyway.
@stevekuznetsov i went through our script files and we use that arg (and ex dockerbuild) everywhere, so I think it should be safe to put it into environment.
If we don't want to persist that mirror, we can remove it at the end of the build (but that will require having rm -rf)... or we can add "post-build" into "ex dockerbuild" that will execute a command just before we commit the container as image. @smarterclayton ?
> When EPEL rolls out a release/update, it takes a while for the hundreds of EPEL mirrors to coalesce around the new versions. When yum or dnf get separate and unique output for repolists, etc, from different mirrors for the same repo, they are unhappy.
@stevekuznetsov that sounds like something that should fail consistently until things are synced. we're seeing flakes.
> @stevekuznetsov that sounds like something that should fail consistently until things are synced. we're seeing flakes.
Unclear. @tdawson can shed more light ... I guess it depends on which mirrors have updated and which have not, the branch-out of the mirror pings could happen to hit only upgraded mirrors, or a mix, where it fails
> so I think it should be safe to put it into environment
if we have the override on the AMI, yes, stick it into /etc/environment in the base_ami build
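A minimal sketch of what "stick it into /etc/environment" could look like in a provisioning step. The helper names and the source path of the override file are hypothetical; the destination path and the `--mount src:dest` shape are the ones quoted earlier in this thread. The sketch only manipulates text, so writing the result to /etc/environment (and checking that consumers accept a quoted value) is left to the caller.

```python
def build_image_args(src_repo_path,
                     dest_repo_path="/etc/yum.repos.d/ci-overrides.repo"):
    """Build the OS_BUILD_IMAGE_ARGS value that mounts the override during a build."""
    return "--mount {}:{}".format(src_repo_path, dest_repo_path)


def set_env_line(environment_text, key, value):
    """Return environment_text with KEY="value" present exactly once.

    An existing assignment for the key is replaced in place, later
    duplicates are dropped, and the assignment is appended if missing,
    so re-running the provisioning step does not grow the file.
    """
    new_line = '{}="{}"'.format(key, value)
    out, replaced = [], False
    for line in environment_text.splitlines():
        if line.split("=", 1)[0].strip() == key:
            if not replaced:
                out.append(new_line)
                replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

Keeping the edit idempotent matters here because the base AMI build may be re-run on top of an image that already carries the setting.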
> If we don't want to persist that mirror, we can remove it at the end of the build
@smarterclayton I thought the whole point of your changes was to do this automatically, _not_ requiring explicit cleanup
@stevekuznetsov our AMIs do _not_ contain "/etc/yum.repos.d/ci-overrides.repo", so if something is creating it, it's doing it as part of the jenkins job, not baking it into the AMI.
the only epel reference I see in our AMIs is epel.repo itself which points to:
mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
failovermethod=priority
@stevekuznetsov is this the same failure we're talking about? Because it sure looks like it tried all the mirrors and couldn't reach any, rather than trying a few mirrors that weren't in sync with each other:
https://ci.openshift.redhat.com/jenkins/view/Image%20Verification/job/push_images_s2i/4452/consoleFull#-19406084205717850ae4b077d9ca921b55
failure: repodata/repomd.xml from epel: [Errno 256] No more mirrors to try.
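One way to tell the two cases apart is to tally the per-mirror error lines in a console log. The sketch below is an assumption-laden helper (its name and the regular expression are hypothetical) that relies only on the line shape visible in the logs at the top of this issue: `URL: [Errno N] message`.

```python
import re
from collections import Counter

# Matches per-mirror lines such as:
#   http://host/path/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
MIRROR_ERROR = re.compile(r"^(https?://\S+): \[Errno (-?\d+)\] (.*)$")


def summarize_mirror_errors(log_text):
    """Count per-mirror yum errors in a log, keyed by (errno, message)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = MIRROR_ERROR.match(line.strip())
        if match:
            counts[(int(match.group(2)), match.group(3))] += 1
    return counts
```

If every counted entry is the "does not match metalink" message, the mirrors were reachable but out of sync. Connection or timeout problems would be expected to show up under a different errno and message.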
I don't see a reason not to use our mirror in most cases on the host,
unless we don't want people using them. The alternative is to set it up at
the beginning of the run and before we bake AMIs
@stevekuznetsov this is happening a lot again (or the FCM is too broad):
https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin_integration/9190/
Yeah, I don't think anyone ever fixed this and I'm not certain all the bits we needed are there in oc ex dockerbuild to make it happen. This will continue to have resurgences as EPEL rolls out large changes.
We need the bug in docker 1.12 (the race) fixed. Might be a kernel thing
@smarterclayton have we shared this need with Antonio/Lokesh/Dan?
Updating my prior comment on what needs to happen today for this to be reasonably resolved:
- docker-1.12.3 installed on the AMIs
- .repo files for major repositories created, which list only our mirror
- [...] .repo overrides on the AMIs
- oc ex dockerbuild can use the .repo overrides during a build
While doing this we should make sure that:
- [...] .repo overrides at /etc/yum.repos.d
While I'm almost certain this will fix the issues, we still don't have a complete understanding of why using EPEL fails at such an incredible rate so often. @tdawson is this to be expected? Are customers of RHEL/normal people on CentOS and Fedora not experiencing this issue as well?
We've ignored this long enough, bumping to p1 as we should really get this fixed and stop hurting from it.
Since I was ignored the first time, let me repeat: https://github.com/openshift/origin/issues/8571#issuecomment-227513301
Yes, it is fundamentally racy. To make it not racy, you need to mirror it and preserve multiple repodata copies in your mirror. Or change Fedora upstream to do that.
Other future-proofing items:
RHEL for example (via Akamai) doesn't delete repodata or RPMs, only adds new ones and swaps repomd.xml.
> Since I was ignored the first time
@cgwalters I don't think you were -- if we are using one and only one mirror to serve RPMs, and we control that mirror, its content and how updates are rolled out on it, races will be much easier to avoid, no?
> RHEL for example (via Akamai) doesn't delete repodata or RPMs,
EPEL definitely does, though
You still need to update that mirror...and if you do, unless you retain multiple repodata copies (i.e. don't use rsync --delete or whatever), you'll still have races between your mirror and clients.
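The ordering suggested in this thread (RPMs synced before repodata, and nothing deleted during the update) can be sketched as two rsync passes. This is only an illustration: the function name, the upstream URL, and the local path are hypothetical, and the sketch builds the command lines without running them. It also assumes repodata file names are unique per update, so that skipping `--delete` leaves the previous copies in place for clients that already fetched the old repomd.xml.

```python
def _with_slash(path):
    """rsync treats 'dir/' as 'the contents of dir', so normalise the trailing slash."""
    return path if path.endswith("/") else path + "/"


def mirror_sync_commands(upstream, local_dir):
    """Return rsync argv lists for a two-phase, non-deleting mirror update.

    Phase 1 copies packages but skips repodata, so the old metadata stays
    valid while new RPMs arrive. Phase 2 copies repodata last, so the new
    repomd.xml only appears once everything it references is present.
    Neither phase passes --delete.
    """
    src = _with_slash(upstream)
    dst = _with_slash(local_dir)
    packages_first = ["rsync", "-a", "--exclude=repodata/", src, dst]
    repodata_second = ["rsync", "-a", src + "repodata/", dst + "repodata/"]
    return [packages_first, repodata_second]
```

Old packages and metadata would then need a separate, delayed cleanup pass, which is outside this sketch.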
Right, but if we time the update to be a time when no jobs are running for our CI, we don't care about those races. When we have time-critical test jobs running on Sunday at 3AM we can talk about more complicated failover for the repodata servers
I'm not certain we run our mirror as a public EPEL mirror today, but if we do, I imagine that a lagged update strategy will make us not play well with that. I think that removing ourselves from the list of public EPEL mirrors to better seal our CI is a fine tradeoff... if we want to keep the public mirror we can always set up another, private one.
containers based on RHEL & our AMI don't have the .repo overrides at /etc/yum.repos.d
containers we build in general have no trace of the override process left once built
@stevekuznetsov @mfojtik @gabemontero fyi we baked all these into the local "rhel7" images we build here:
https://github.com/openshift/vagrant-openshift/blob/master/lib/vagrant-openshift/action/install_origin_rhel7.rb#L43
those images are used as the base for the rhel versions of the s2i/jenkins/db images that we build and push to the ci.dev.openshift.redhat.com registry. They don't get published so it's probably not a big deal if they contain an inaccessible repository. Also if the rhel images we build there don't contain the working mirrors, then when we go to build the real images on top of the rhel image (which we do w/ docker build), we'll have problems.
tl;dr: the local rhel image we build should probably be left alone.
> Also if the rhel images we build there don't contain the working mirrors, then when we go to build the real images on top of the rhel image (which we do w/ docker build), we'll have problems.
Yes, this is why we are going to be injecting the overrides during the build steps and removing them later. A downstream user of these images should not know that in the original build, only a single mirror was used. I don't quite understand what you mean by:
> tl;dr: the local rhel image we build should probably be left alone.
I have the changes in master sorted to allow this. But it will require
changes to the jobs to pass the right env var.
1) Various dots I've connected so far:
- The master changes @smarterclayton noted, which allow imagebuilder to be invoked with the -mount option based on the setting of an env var, are here: https://github.com/openshift/origin/blob/master/hack/common.sh#L816-L827. This script function is ultimately called by build-base-images.sh, etc.
- The vagrant calls in the devenv_ami job for building images are:
  vagrant build-origin-base-images
  vagrant build-origin --images
- vagrant build-origin-base-images calls build-base-images.sh here: https://github.com/openshift/vagrant-openshift/blob/master/lib/vagrant-openshift/action/build_origin_base_images.rb#L31-L39. The setting of OS_BUILD_IMAGE_ARGS to include -mount <src yum override for EPEL file location>:<destination yum override for EPEL file location that will not be present in the final output image> would presumably go in that portion of the ruby file.
- vagrant build-origin --images calls make release here: https://github.com/openshift/vagrant-openshift/blob/master/lib/vagrant-openshift/action/build_origin.rb#L31-L35. Same note re: OS_BUILD_IMAGE_ARGS.
- Per guidance from @stevekuznetsov, the yum override file that redirects where our build fetches EPEL, etc. should be stored in the https://github.com/openshift/vagrant-openshift/tree/master/lib/vagrant-openshift/resources directory.
- Per @bparees 's earlier comment, there is this snippet in the vagrant-openshift plugin's install-origin action: https://github.com/openshift/vagrant-openshift/blob/master/lib/vagrant-openshift/action/install_origin_rhel7.rb#L43 ... my understanding (which @stevekuznetsov confirmed for me) is that the new override file change will correct any repo file currently present that points to the "public" EPEL repo we want to replace with our private one. Does that sound right?
2) things that still need to occur or I don't know how to do yet
- I was not finding imagebuilder in our AMIs ... @stevekuznetsov confirmed for me that it was in there for a bit and then pulled back out. I'm assuming it needs to be reinserted, with resolutions for whatever caused it to be reverted earlier, and that re-introducing imagebuilder to the AMIs is outside my list of to-dos.
- I don't know what a yum override file should look like in general, or what it should specifically contain for the EPEL "bypass" we are talking about. The closest search hit I got was https://access.redhat.com/solutions/2067953, but it requires a Red Hat subscription. Assuming it is the right path, do we have subscription credentials for internal use, or is there an internal version of this web page? Otherwise, I just need some general pointers on what to do here. I'm thinking it is just another repo file like the other ones in the https://github.com/openshift/vagrant-openshift/tree/master/lib/vagrant-openshift/resources directory, but am not 100% confident on that.
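For orientation, a yum .repo file is an INI-style file with one section per repository, so an override is most likely "just another repo file" that points at a single baseurl instead of a mirrorlist or metalink. The sketch below renders one. The repo id, display name, and baseurl are hypothetical placeholders: the thread only confirms that a ready-made file exists at https://mirror.openshift.com/mirror/epel/epel7.repo, not what it contains. It also assumes the upstream epel repo is disabled separately (for example with the `yum-config-manager --disable epel` command from the yum output above), since this file adds a repo rather than replacing one.

```python
import configparser
import io


def render_repo_override(repo_id, name, baseurl):
    """Render a single-repository yum .repo file as text.

    Interpolation is disabled so yum variables such as $basearch
    pass through to the output untouched.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser[repo_id] = {
        "name": name,
        "baseurl": baseurl,  # one fixed mirror, no mirrorlist/metalink
        "enabled": "1",
    }
    buf = io.StringIO()
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder values: adjust to the real mirror layout before use.
    print(render_repo_override(
        "ci-epel-mirror",
        "EPEL 7 (CI mirror)",
        "https://mirror.openshift.com/mirror/epel/7/$basearch/",
    ))
```

The rendered text would be what gets stored in the resources directory and mounted at /etc/yum.repos.d/ci-overrides.repo during a build.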
You can just pull imagebuilder in the job.
OK thx @smarterclayton - I'll just do a go get -u github.com/openshift/imagebuilder/cmd/imagebuilder in the job.
Also, for the override file, I concluded that my guess was right, and I just need to create a repo file. Seeing the epel.repo file in the ami's right now, I'll just modify it to point to the local mirror. @stevekuznetsov concurred, and per him and @tdawson, that means unsetting mirrorlist and setting baseurl to either
https://mirror.openshift.com/mirror/epel/ or https://mirror.openshift.com/mirror/epel/7/x86_64/.
Aside from experimenting with those 2 urls, I'll experiment with whether we can just mount over the existing epel.repo file, or if we need to do some move/rename swapping of file to that filename.
I don't know that you can safely change epel - I recall hitting something
while using it. Instead, use a unique name and give your repo priority.
give your repo priority.
How do we edit /etc/yum.conf without leaving a trace after the build?
In your repo file you can set priority=NN. The default priority is 99, so anything lower will have higher priority. You may need to install yum-plugin-priorities.
Thanks for the additional tidbits @smarterclayton @sdodson @stevekuznetsov - got enough dots connected now to try the end to end change here. Assuming all goes well I'll report back when I have positive sandbox results and note the vagrant-openshift PR that results. I'll leave the issue open until I can update the devenv_ami job and see it successfully leverage our local epel mirror with the new vagrant-openshift plugin.
Don't edit yum.conf for sure - use the drop in.
@sdodson - hey, so I've installed yum-plugin-priorities, and I created a /etc/yum.repos.d/local_epel.repo file with these contents (where I copied epel.repo, commented out mirrorlist, added baseurl, and added priority):
[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
baseurl=https://mirror.openshift.com/mirror/epel/7/x86_64
#mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
priority=98
[epel-debuginfo]
name=Extra Packages for Enterprise Linux 7 - $basearch - Debug
baseurl=https://mirror.openshift.com/mirror/epel/7/x86_64/debug
#mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-debug-7&arch=$basearch
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=1
priority=98
But when I run sudo yum repolist -v epel I get this:
Repo-id : epel/x86_64
Repo-name : Extra Packages for Enterprise Linux 7 - x86_64
Repo-status : enabled
Repo-revision: 1484186506
Repo-tags : binary-x86_64
Repo-updated : Wed Jan 11 21:55:41 2017
Repo-pkgs : 11,040
Repo-size : 11 G
Repo-metalink: https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=x86_64
Updated : Wed Jan 11 21:55:41 2017
Repo-baseurl : http://s3-mirror-us-east-1.fedoraproject.org/pub/epel/7/x86_64/ (41 more)
Repo-expire : 9,999,999 second(s) (last: Thu Jan 12 12:05:09 2017)
Filter : read-only:present
Repo-filename: /etc/yum.repos.d/epel.repo
Running just sudo yum repolist does recognize that the epel and epel-debuginfo repositories are listed more than once in the configuration.
Any suggestions?
Give it a different name instead of epel (in the file itself).
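A minimal sketch of what that rename could look like, for anyone following along. The repo id local-epel and the scratch directory are assumptions for illustration; the baseurl, gpgkey and priority values are copied from the file quoted above. It writes to a temp directory rather than /etc/yum.repos.d so it is safe to run anywhere.

```shell
#!/bin/sh
# Sketch only: write a drop-in repo file whose section id differs from "epel",
# so yum does not see it as a second definition of the stock epel repo.
# REPO_DIR defaults to a scratch directory; on a real host it would be /etc/yum.repos.d.
REPO_DIR="${REPO_DIR:-$(mktemp -d)}"
REPO_FILE="$REPO_DIR/local_epel.repo"

# The quoted heredoc delimiter keeps $basearch literal, so yum expands it, not the shell.
# "local-epel" is a hypothetical id; any id not already in use would do.
cat > "$REPO_FILE" <<'EOF'
[local-epel]
name=Local mirror of Extra Packages for Enterprise Linux 7 - $basearch
baseurl=https://mirror.openshift.com/mirror/epel/7/x86_64
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
priority=98
EOF

echo "wrote $REPO_FILE"
```

Running sudo yum repolist -v local-epel afterwards is one way to check which file and baseurl yum picked up for the new id.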
@smarterclayton what's the blocker-bug dance we do with flakes like this? Move to 1.6 milestone? @gabemontero should have more time to look at this in the coming days.
Yeah milestone move
So I got a few cycles to mess with this today. I'm currently blocked by what seems to be a shortcoming or unsupported scenario with imagebuilder --mount.
The details:
[ec2-user@ip-172-18-5-183 origin]$ ls -la /etc/yum.repos.d/local_epel.repo
-rw-rw-r--. 1 ec2-user ec2-user 1018 Feb 8 14:40 /etc/yum.repos.d/local_epel.repo
[ec2-user@ip-172-18-5-183 origin]$
the arg --mount /etc/yum.repos.d/local_epel.repo:/etc/yum.repos.d/local_epel.repo is getting passed into imagebuilder via the env var
Then, I inspect the contents of /etc/yum.repos.d/local_epel.repo in 2 places
1) I added a few echo and ls -la commands in the RUN portion of https://github.com/openshift/origin/blob/master/images/base/Dockerfile
2) I then did a docker run -it <image id> /bin/bash on the resulting openshift/origin-base image, and then ran ls -la /etc/yum.repos.d
In both cases, /etc/yum.repos.d/local_epel.repo is not the file I asked to be mounted, but a directory (with no files in it).
With --mount /etc/yum.repos.d/local_epel.repo:/etc/yum.repos.d, the imagebuilder command bombs. @stevekuznetsov did show me something he is working on, but he's doing a --mount where both src and dst are directories.
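For reference, a rough sketch of how the file-to-file variant gets wired up. It assumes the env var in question is OS_BUILD_IMAGE_ARGS (the one read in hack/common.sh) and only assembles and prints the argument; it does not invoke imagebuilder itself.

```shell
#!/bin/sh
# Sketch only: assemble the --mount argument that reaches imagebuilder
# through OS_BUILD_IMAGE_ARGS. Source and destination are both files here.
SRC_REPO="/etc/yum.repos.d/local_epel.repo"   # file on the build host
DST_REPO="/etc/yum.repos.d/local_epel.repo"   # path seen during the image build

OS_BUILD_IMAGE_ARGS="--mount ${SRC_REPO}:${DST_REPO}"
export OS_BUILD_IMAGE_ARGS

# Warn (do not fail) if the source is not a regular file on this host.
if [ ! -f "$SRC_REPO" ]; then
  echo "warning: $SRC_REPO is not a regular file on this host" >&2
fi

echo "OS_BUILD_IMAGE_ARGS=$OS_BUILD_IMAGE_ARGS"
# With the variable exported, the image build (build-base-images.sh in this
# thread) would be started from the same shell.
```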
The -mount will be a transient mount so doing docker run -it on the resulting container should show no trace of the mount point.
Yep, forgot to mention that in my ii), but it crossed my mind as well, which was why I added the debug in the Dockerfile.
So seemingly 2 issues with imagebuilder.
@smarterclayton stopped by ... there was a race condition at docker 1.12.2 which could cause this. Upgrading to 1.12.5 is crashing docker now, but if I can get docker healthy in the short term I'll give it another go.
Yep, things improved at docker 1.12.5. The mounted file is present as a file and with the correct contents during the docker build.
On the transient mount point, the file is still there, but empty when I exec into the image:
GGM in base Dockerfile listing repo dir
total 48
drwxr-xr-x. 2 root root 252 Feb 8 20:56 .
drwxr-xr-x. 49 root root 4096 Feb 8 20:56 ..
-rw-r--r--. 1 root root 1664 Nov 29 18:12 CentOS-Base.repo
-rw-r--r--. 1 root root 1309 Nov 29 18:12 CentOS-CR.repo
-rw-r--r--. 1 root root 649 Nov 29 18:12 CentOS-Debuginfo.repo
-rw-r--r--. 1 root root 630 Nov 29 18:12 CentOS-Media.repo
-rw-r--r--. 1 root root 1331 Nov 29 18:12 CentOS-Sources.repo
-rw-r--r--. 1 root root 2893 Nov 29 18:12 CentOS-Vault.repo
-rw-r--r--. 1 root root 314 Nov 29 18:12 CentOS-fasttrack.repo
-rw-r--r--. 1 root root 1056 Dec 27 17:37 epel-testing.repo
-rw-r--r--. 1 root root 957 Dec 27 17:37 epel.repo
-rw-rw-r--. 1 root root 1018 Feb 8 19:57 local_epel.repo
GGM done with repo dir ls
--> LABEL io.k8s.display-name="OpenShift Origin Centos 7 Base" io.k8s.description="This is the base image from which all OpenShift Origin images inherit."
--> Committing changes to openshift/origin-base ...
--> Done
[ec2-user@ip-172-18-5-183 base]$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
openshift/origin-base latest 982e2598fbe3 7 seconds ago 274.9 MB
docker.io/centos centos7 67591570dd29 7 weeks ago 191.8 MB
[ec2-user@ip-172-18-5-183 base]$ docker run -it 982e2598fbe3 /bin/bash
[root@7eb8d17f9dff /]# ls -la /etc/yum.repos.d
total 44
drwxr-xr-x. 2 root root 252 Feb 8 20:56 .
drwxr-xr-x. 49 root root 4096 Feb 8 20:56 ..
-rw-r--r--. 1 root root 1664 Nov 29 18:12 CentOS-Base.repo
-rw-r--r--. 1 root root 1309 Nov 29 18:12 CentOS-CR.repo
-rw-r--r--. 1 root root 649 Nov 29 18:12 CentOS-Debuginfo.repo
-rw-r--r--. 1 root root 630 Nov 29 18:12 CentOS-Media.repo
-rw-r--r--. 1 root root 1331 Nov 29 18:12 CentOS-Sources.repo
-rw-r--r--. 1 root root 2893 Nov 29 18:12 CentOS-Vault.repo
-rw-r--r--. 1 root root 314 Nov 29 18:12 CentOS-fasttrack.repo
-rw-r--r--. 1 root root 1056 Dec 27 17:37 epel-testing.repo
-rw-r--r--. 1 root root 957 Dec 27 17:37 epel.repo
-rwxr-xr-x. 1 root root 0 Feb 8 20:55 local_epel.repo
[root@7eb8d17f9dff /]#
Now, next step, I'm still trying to confirm that the epel mirror was actually used / get that working. So far, yum info | grep local-epel is turning up bupkus. yum info | grep epel and yum info | grep extras by comparison turn up a bunch of hits for the repos that various packages were pulled from.
OK, got it to work by separating out the install of yum-plugin-priorities and having it precede, say, the install of epel-release. yum info epel-release then confirms it pulled it from our mirror.
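Roughly, the ordering looks like the sketch below. The run helper and the DRY_RUN switch are additions for illustration so the sequence can be printed without touching a real system; in the Dockerfile these would simply be separate yum invocations.

```shell
#!/bin/sh
# Sketch only: install the priorities plugin in its own step so it is already
# active when later packages are resolved against the repos.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_in_order() {
  run yum install -y yum-plugin-priorities   # 1. plugin first, as its own step
  run yum install -y epel-release            # 2. then a package the mirror can serve
  run yum info epel-release                  # 3. inspect which repo it came from
}

PLAN="$(install_in_order)"
echo "$PLAN"
```

Setting DRY_RUN=0 on a host with yum would execute the same three steps for real.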
Now, I got a nasty looking 404 the first time I tried to use the mirror, and then subsequent yum installs worked. It gave @stevekuznetsov and me pause, but we are still leaning toward giving it a go.
But with the current vagrant -> oct transition going on, we decided to checkpoint on Monday and see where things are with that transition before moving forward.
Update:
imagebuilder on the ami's when appropriate
Let's do the dance...
Slightly different, but still this:
Failure talking to yum: failure: repodata/repomd.xml from centos-paas-sig-openshift-origin13-rpms: [Errno 256] No more mirrors to try.
https://buildlogs.centos.org/centos/7/paas/x86_64/openshift-origin13/repodata/repomd.xml: [Errno 14] curl#52 - "Empty reply from server"
Ok, so I'm going to fix this by adding a symlink from /var/run/secrets/overrides.repo (or similar) into /etc/yum.repos.d/overrides.repo, and then we'll use the RHEL secrets patch to put overrides.repo into /usr/share/rhel/secrets/overrides.repo. That'll ensure we use the mirror only on our machines.
Thoughts?
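As a rough illustration of the layout being proposed, here is a sketch that creates the link inside a scratch root rather than the real /etc. It reads the proposal as "the link lives in /etc/yum.repos.d and points at the secrets path"; that direction, the ROOT variable and the scratch directory are assumptions.

```shell
#!/bin/sh
# Sketch only: create the proposed symlink under a scratch root so the layout
# can be inspected without modifying the real /etc/yum.repos.d.
ROOT="${ROOT:-$(mktemp -d)}"
LINK="$ROOT/etc/yum.repos.d/overrides.repo"
TARGET="/var/run/secrets/overrides.repo"   # only populated where the secrets mount exists

mkdir -p "$(dirname "$LINK")"
ln -sfn "$TARGET" "$LINK"

# Where the secret is absent the link dangles; report which case applies here.
if [ -e "$LINK" ]; then
  echo "overrides.repo resolves to a real file"
else
  echo "overrides.repo is a dangling link on this host"
fi
```

How yum treats the dangling link on machines without the secret would still need to be checked before relying on this.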