Sig-release: [Umbrella] Considerations around a mirror registry

Created on 2 Dec 2020 · 15 Comments · Source: kubernetes/sig-release

What would you like to be added:

Mirroring for popular non-Kubernetes community images that are used across the project.

Why is this needed:

Keying off of some recent discussions, specifically around Docker Hub changes, some community members have requested mirrored images or that we build our own for critical release and testing components.

Some recent discussions include:

I've created a staging repository for this in https://github.com/kubernetes/k8s.io/pull/1441, but how to approach implementation is undecided at the moment.

For now, I'd like to collect some feedback on what contributor requirements are before we do anything else.

cc: @kubernetes/sig-release @kubernetes/k8s-infra-team @kubernetes/sig-testing

ref: https://kubernetes.slack.com/archives/CCK68P2Q2/p1605906060178100, https://github.com/kubernetes/kubernetes/pull/95567, https://github.com/kubernetes/test-infra/issues/19477, https://www.docker.com/pricing/resource-consumption-updates, https://cloud.google.com/container-registry/docs/pulling-cached-images

kind/feature needs-priority sig/release

All 15 comments

I think @spiffxp filed a similar issue the other day.

Ideally I'd like to move things to primary hosting on a registry we're comfortable with, instead of configuring mirrors (this may be what you meant already, but I want to be clear on that point).

e.g. for e2e.test we want to get all of the images we use into k8s.gcr.io, even if that means just copying them over and updating the references. cc @claudiubelu @wilsonehusin
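To illustrate the "copy them over and update the references" step, here is a minimal sketch of rewriting an upstream image reference so that it points at a registry the project controls. The target prefix `registry.example.com/mirror` and the function names are hypothetical; the only rule taken as given is Docker's convention that the first path component is a registry host only if it contains `.` or `:` or is `localhost`, and that bare names live under Docker Hub's `library/` namespace.

```python
from typing import Tuple


def split_reference(ref: str) -> Tuple[str, str]:
    """Split an image reference into (registry host, remainder)."""
    first, sep, rest = ref.partition("/")
    # The first component is a registry host only if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    # No slash at all: an official Docker Hub image such as "busybox:1.29".
    if not sep:
        return "docker.io", "library/" + ref
    # "user/image" form: Docker Hub is implied.
    return "docker.io", ref


def retarget(ref: str, new_registry: str = "registry.example.com/mirror") -> str:
    """Return the reference rewritten under new_registry (hypothetical layout).

    The upstream host is kept as a path component so that images with the
    same name from different registries do not collide.
    """
    host, remainder = split_reference(ref)
    # A host with a port contains ":", which is not valid inside a repository path.
    return f"{new_registry}/{host.replace(':', '_')}/{remainder}"


print(retarget("busybox:1.29"))
print(retarget("quay.io/coreos/etcd:v3.4.13"))
```

Keeping the tag (or digest) untouched means the only change needed in callers is the reference string itself.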

> I think @spiffxp filed a similar issue the other day.

Very likely; just wanted to tie a bunch of threads together in a SIG Release context, so RelEng can start chunking the work in the new year.

> Ideally I'd like to move things to _primary_ hosting on a registry we're comfortable with, instead of configuring mirrors (this may be what you meant already, but I want to be clear on that point).
>
> e.g. for e2e.test we want to get all of the images we use into k8s.gcr.io, even if that means just copying them over and updating the references.

Agreed. I think there will likely be a few categories:

  • images that we push ourselves, but to Google infra, and that need to be transferred over to the community (out of scope here, but in scope for https://github.com/kubernetes/k8s.io/issues/1458)
  • images hosted further "upstream" (e.g., Docker Hub / Quay) whose pull failures we need to be resilient to (candidates for mirroring)
  • images that we depend on, but could maybe eventually build ourselves, e.g., Golang (mirror first, then create our own)

@hakman pointed out to me that containerd doesn't release ARM artifacts (https://github.com/containerd/containerd/releases/tag/v1.3.9) and AIUI points users to docker for ARM builds. I can see a lot of things being in the 3rd category ("we build it to normalize it").

@justinsb -- absolutely! I think another great example along the arch side is distroless: https://github.com/GoogleContainerTools/distroless/issues/583.

When multi-arch images were introduced, they were only for arm64, which broke us.
@dims and @mattmoor did some great work to get us back into a good state.

Not necessarily saying that distroless is on the list of images needing to be mirrored, but it is an example of an assumption causing a break in our workflow.

Sent a note to k-dev to get additional feedback here: https://groups.google.com/g/kubernetes-dev/c/198cwXYDtjc

An OCI image index seems like a nice idea: “I don't care where you get it from, but the manifest had better have exactly _this hard-to-collide-with checksum_”.
So, I'd like an approach that works for now and leaves room for more improvements.
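The "I don't care where you get it from" idea can be sketched as a digest check: an OCI digest is the hash of the raw manifest bytes, written as `sha256:<hex>`, so a manifest fetched from any mirror can be verified against a pinned value. This is a minimal illustration only; the function names and the sample manifest bytes are made up, and fetching from a real registry is left out.

```python
import hashlib
import hmac


def digest_of(manifest_bytes: bytes) -> str:
    """Compute the sha256 digest string for raw manifest bytes."""
    return "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()


def verify_manifest(manifest_bytes: bytes, expected_digest: str) -> bool:
    """Return True only if the bytes hash to exactly the pinned digest."""
    return hmac.compare_digest(digest_of(manifest_bytes), expected_digest)


# Placeholder bytes standing in for a manifest served by the primary registry.
original = b'{"schemaVersion": 2, "manifests": []}'
pinned = digest_of(original)

# The same bytes served by any mirror verify; altered bytes do not.
from_mirror = bytes(original)
tampered = original + b" "
print(verify_manifest(from_mirror, pinned))
print(verify_manifest(tampered, pinned))
```

The check has to run over the bytes exactly as served, since re-serializing the JSON would change the digest.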

Great topic - I’ll add https://github.com/kubernetes/kubernetes/pull/93510, which is what OpenShift is now using to support offlining all images used by a test framework for disconnected users.

I am very incentivized to work with folks to make this easier for everyone.

Another use case that isn’t mentioned here exactly - downstream distros sometimes can’t trust upstream community images (or need a process to rebuild them years or decades later), so being able to control the source of images as part of a unified process is also desirable, which requires mirroring and an audit list of every image used.
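As a rough illustration of what "an audit list of every image used" could look like, the sketch below scans manifest text for `image:` fields and records which files use each reference. It assumes the images are declared in YAML-style manifests; the file names, the regular expression, and the function name are all hypothetical, and images referenced from Go code (as in the e2e framework) would need a different collector.

```python
import re
from typing import Dict, List

# Matches lines such as "image: busybox:1.29" or "- image: "quay.io/x/y:v1"".
IMAGE_LINE = re.compile(r"^[ \t]*(?:-[ \t]*)?image:[ \t]*[\"']?([^\s\"']+)", re.MULTILINE)


def build_audit_list(files: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each image reference to the sorted list of files that use it."""
    usage: Dict[str, set] = {}
    for name, text in files.items():
        for match in IMAGE_LINE.finditer(text):
            usage.setdefault(match.group(1), set()).add(name)
    return {image: sorted(names) for image, names in sorted(usage.items())}


# Hypothetical manifests used only to exercise the function.
sample_files = {
    "a.yaml": "containers:\n- name: x\n  image: busybox:1.29\n",
    "b.yaml": '    image: "quay.io/coreos/etcd:v3.4.13"\n  - image: busybox:1.29\n',
}

for image, used_in in build_audit_list(sample_files).items():
    print(image, used_in)
```

A list like this can then be diffed between releases to spot newly introduced upstream dependencies.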

> downstream distros sometimes can’t trust upstream community images (or need a process to rebuild them years or decades later), so being able to control the source of images as part of a unified process is also desirable, which requires mirroring and an audit list of every image used.

Love this, @smarterclayton!
Artifact management is high on my priority list for 2021 and I think as we continue to build a cohesive story around it, authorization/attestation definitely comes into play.

Loosely connected to the problem of isolating from upstream issues: I recently had to reduce the bandwidth used by image pulls across clusters, which happen every time an image moves or a DaemonSet is deployed. Caching the blobs locally also shortens restart delays. Both of those issues were important for my customers.

That bandwidth usage is mainly the blobs, so I wrote a pull-through cache that records those blobs and serves them directly. It supports parallel pulls of the same blob, and peer-checking to avoid hitting the upstream while allowing high availability.
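For readers unfamiliar with the pattern, here is a minimal sketch of a pull-through blob cache that collapses parallel pulls of the same blob into a single upstream fetch. It is not the implementation in the linked repository; the class name, the `fetch_upstream` callback, and the in-memory storage are assumptions made for illustration, and peer-checking is not covered.

```python
import threading
import time
from typing import Callable, Dict


class BlobCache:
    """In-memory pull-through cache with one upstream fetch per blob."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]):
        self._fetch = fetch_upstream
        self._blobs: Dict[str, bytes] = {}
        self._inflight: Dict[str, threading.Event] = {}
        self._lock = threading.Lock()

    def get(self, digest: str) -> bytes:
        while True:
            with self._lock:
                if digest in self._blobs:
                    return self._blobs[digest]
                event = self._inflight.get(digest)
                leader = event is None
                if leader:
                    # First caller for this digest performs the upstream fetch.
                    event = threading.Event()
                    self._inflight[digest] = event
            if not leader:
                # Wait for the leader, then re-check the cache (or retry on failure).
                event.wait()
                continue
            try:
                data = self._fetch(digest)
                with self._lock:
                    self._blobs[digest] = data
                return data
            finally:
                with self._lock:
                    del self._inflight[digest]
                event.set()


upstream_calls = []


def slow_upstream(digest: str) -> bytes:
    """Stand-in for a registry request; records each call."""
    upstream_calls.append(digest)
    time.sleep(0.05)
    return b"blob-for-" + digest.encode()


cache = BlobCache(slow_upstream)
results = []
threads = [threading.Thread(target=lambda: results.append(cache.get("sha256:abc"))) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(upstream_calls), len(results))
```

If the leading fetch raises, the error goes to that caller only, and one of the waiting callers retries the fetch on the next loop iteration.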

https://github.com/mcluseau/docker-registries-mirror

As it is referenced as a "mirror" in containerd terminology, it may or may not be a use-case here ^^

WRT "be able to rebuild years later" - This roughly means forking every source repo for every image we depend on transitively ad infinitum, right? Otherwise we simply do not have that ability. GitHub repos DO disappear occasionally. I made the same argument about godeps a long time back, and the discussion swirled on how much energy it takes to maintain such a beast.

I do think that we, as a project used by a lot of people, owe our users our best efforts around sanity here. Having deps on images we didn't build is bad. Having deps on infra we don't control is worse.

That said, our project pays for this stuff and serving billions of image pulls "ain't cheap". So we need to be careful not to position our "mirror" as a free (as in beer) dockerhub. It needs to be scoped to JUST things we _need_ for the project to operate (tests, etc) and for it to be installed - a "standard" k8s install should not need to touch dockerhub, but if a user puts their own images there, that's on them.

I do not think we have funding, staffing, or mandate to operate a free "mirror" of any significant fraction of dockerhub.

@justaugustus - Is this feature specifically scoped to e2e/test and other identifiable images that the project pulls and runs, or is it meant to apply wherever possible, in which case it could not avoid covering a significant fraction of Docker Hub, as @thockin mentioned?

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

/remove-lifecycle stale
/cc @hh
