Enhancements: Improve multi-platform compatibility

Created on 30 Apr 2017 · 61 comments · Source: kubernetes/enhancements

Kubernetes on multiple platforms

  • One-line feature description (can be used as a release note): Kubernetes should work on the platforms the community expects it to work on. Automated CI e2e tests should be run for all supported architectures. It should be possible to run clusters with nodes of mixed architectures.
  • Primary contact (assignee): @luxas @mkumatag @ixdy
  • Responsible SIGs: no formal sig / sig-release will probably be the closest

    • Mostly this is about making our release/test tooling work on multiple platforms and providing a deployment solution that works everywhere: kubeadm

  • Design proposal link (community repo): https://github.com/kubernetes/community/blob/master/contributors/design-proposals/multi-platform.md
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred: @ixdy @luxas @vishh
  • Approver (likely from SIG/area to which feature belongs): @ixdy @luxas @vishh @thockin @brendandburns
  • Feature target (which target equals to which milestone):

    • Alpha release target (x.y): v1.3/v1.4

    • Beta release target (x.y): v1.12?

    • Stable release target (x.y): TBD

It's very hard to estimate alpha/beta/stable levels of this feature as there is no clear graduation path.
Anyway, I'd consider this feature "done" when:

  • We have automated CI e2e Conformance tests running continuously for arm, arm64, ppc64le, s390x and windows. Currently we're lacking hardware support; I'm in touch with CNCF to fix this eventually. Also the e2e suite should be
  • Manifest lists are pushed so that the same image tag is usable on any platform, without having to substitute -ARCH to get the right Docker image downloaded.
  • All Kubernetes core and incubated projects push artifacts for all supported platforms
  • QEMU inside the multiarch containers is no longer needed. When building Docker images for a foreign architecture, we use QEMU for syscall emulation, which currently requires the QEMU binary to be present in the resulting image. With kernel 4.8 or newer, however, this can be done without adding anything extra to the resulting image.
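The manifest-list point above can be illustrated with a small sketch: a registry stores an image index whose entries each point at a per-architecture image, and the client picks the entry matching its own platform, so one tag serves every architecture. The index below is a hand-written toy (digests and structure are illustrative, not a real k8s.gcr.io response), loosely modeled on the OCI image index format.

```python
# Sketch: how a client resolves one image tag to the right per-arch image via
# a manifest list (OCI image index). SAMPLE_INDEX is illustrative only.

SAMPLE_INDEX = {
    "schemaVersion": 2,
    "manifests": [
        {"digest": "sha256:aaa...", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:bbb...", "platform": {"architecture": "arm64", "os": "linux"}},
        {"digest": "sha256:ccc...", "platform": {"architecture": "s390x", "os": "linux"}},
    ],
}

def resolve(index, arch, os="linux"):
    """Return the digest of the child manifest matching (arch, os), or None."""
    for m in index["manifests"]:
        p = m["platform"]
        if p["architecture"] == arch and p["os"] == os:
            return m["digest"]
    return None  # a real client would fail or fall back here

print(resolve(SAMPLE_INDEX, "arm64"))  # -> sha256:bbb...
```

This is why the -ARCH tag substitution becomes unnecessary: the arch selection moves from the deployment tooling into the registry/client manifest resolution.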

KubeCon video where I'm talking about this: https://www.youtube.com/watch?v=ZdzKQwMjg2w

kind/feature sig/release stage/beta tracked/no


All 61 comments

Sounds good to me.

Labeled as "help wanted" - SIG has to be defined.

@idvoretskyi As soon as SIG Release is formally formed as a group, I expect the work to belong to that SIG. Also crosses some borders with the cluster lifecycle SIG.

@calebamiles @pwittrock is SIG-release going to work on this feature?

@idvoretskyi To clarify, so far I and a couple of other persons have been hacking on this with no formal SIG owning it.
This is basically a thing that touches sig-cluster-lifecycle, sig-testing, sig-release and sig-windows all at the same time.

However, I'll talk to @ixdy @vishh @thockin and sig-release where we should formally place this

@luxas it's amazing to see this work in progress!

At the same time, the feature has to have one or more SIGs that are formally responsible for its implementation.

@luxas FYI - I've created a sig/release label and labeled this issue with it.

@idvoretskyi Perfect, waited for that label!
As you saw, I removed it from the v1.7 milestone as it won't make it due to external dependencies (gcr.io, etc.).

I also removed it from Action Required as no action needs to be taken for this; targeting v1.8 now

This depends on the GCR team adding support for manifest lists. Adding to the v1.8 milestone for now.
I'm not sure the dependency will be resolved before code freeze (I have waited a year for this to happen already), but in case it does we should track this.

Is GCR support for manifest list images an absolute requirement for this? Could docker hub or another public registry that supports manifest list images (if there are any) be used until GCR supports manifest list images?

Is GCR support for manifest list images an absolute requirement for this?

Yes, as gcr.io is the repo we're using for kubernetes release artifacts.

Could docker hub or another public registry that supports manifest list images (if there are any) be used until GCR supports manifest list images?

No, unfortunately not; that would be a huge, backwards-incompatible change and would result in lots of other complications.

I'm told the GCR team is still working on this :smile:, but since it is not ready yet, this stays in alpha :(
On the good side, significant improvements have been made during the cycle to make sure all Conformance tests (another requirement for beta) are now passing on the other platforms :+1:

@luxas @kubernetes/sig-cluster-lifecycle-feature-requests can you confirm that this feature targets 1.8?

If yes, please, update the features tracking spreadsheet with the feature data, otherwise, let's remove this item from 1.8 milestone.

Thanks

@idvoretskyi It targets v1.8, and I updated the spreadsheet

@luxas :wave: Please indicate in the 1.9 feature tracking board
whether this feature needs documentation. If yes, please open a PR and add a link to the tracking spreadsheet. Thanks in advance!

No new docs needed here, still waiting on GCR supporting manifest lists, so this feature can't graduate to beta yet.

GCR now supports manifests, so this should be able to move forwards again.

Does this project belong to sig-cluster-lifecycle or sig-release or both?

@luxas @kubernetes/sig-cluster-lifecycle-feature-requests do you have any further plans (esp. for v1.10) for this feature?

@luxas @kubernetes/sig-cluster-lifecycle-feature-requests Any plans for this in 1.11?

If so, can you please ensure the feature is up-to-date with the appropriate:

  • Description
  • Milestone
  • Assignee(s)
  • Labels:

    • stage/{alpha,beta,stable}

    • sig/*

    • kind/feature

cc @idvoretskyi

This won't move to beta yet, targeting v1.12

/stage beta
/milestone v1.12

@luxas @mkumatag @ixdy --

It looks like this feature is currently in the Kubernetes 1.12 Milestone.

If that is still accurate, please ensure that this issue is up-to-date with ALL of the following information:

  • One-line feature description (can be used as a release note):
  • Primary contact (assignee):
  • Responsible SIGs:
  • Design proposal link (community repo):
  • Link to e2e and/or unit tests:
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred:
  • Approver (likely from SIG/area to which feature belongs):
  • Feature target (which target equals to which milestone):

    • Alpha release target (x.y)

    • Beta release target (x.y)

    • Stable release target (x.y)

Set the following:

  • Description
  • Assignee(s)
  • Labels:

    • stage/{alpha,beta,stable}

    • sig/*

    • kind/feature

Once this feature is appropriately updated, please explicitly ping @justaugustus, @kacole2, @robertsandoval, @rajendar38 to note that it is ready to be included in the Features Tracking Spreadsheet for Kubernetes 1.12.


Please note that the Features Freeze is July 31st, after which any incomplete Feature issues will require an Exception request to be accepted into the milestone.

In addition, please be aware of the following relevant deadlines:

  • Docs deadline (open placeholder PRs): 8/21
  • Test case freeze: 8/28

Please make sure all PRs for features have relevant release notes included as well.

Happy shipping!

/cc @timothysc ^

@luxas @timothysc @neolit123 --
Feature Freeze is today. Are we planning on graduating this feature in Kubernetes 1.12?
If so, can you make sure everything is up-to-date, so I can include it on the 1.12 Feature tracking spreadsheet?

@justaugustus we should wait on @luxas to comment on the estimate.
from my perspective beta seems OK given the gcr.io bucket nowadays has good arch support for the required images.

also, i would like to remind that while this is a sig-cluster-lifecycle initiative, it's really sig-release who should be owning this feature.
@kubernetes/sig-release-feature-requests

@luxas -- have you had a chance to review this?

Are you certain that @luxas is the only reviewer for this?

/assign @tpepper @timothysc @dims
Please update here when you have a chance.

The majority of the work on building multi-arch e2e test images is in; the last item is this:
https://github.com/kubernetes/kubernetes/pull/66984

Once we get this merged in kubernetes/release:
https://github.com/kubernetes/release/pull/516

We can switch things on in kubeadm:
https://github.com/kubernetes/kubernetes/pull/66960

Hey there! @luxas I'm the wrangler for the Docs this release. Is there any chance I could have you open up a docs PR against the release-1.12 branch as a placeholder? That gives us more confidence in the feature shipping in this release and gives me something to work with when we start doing reviews/edits. Thanks! If this feature does not require docs, could you please update the features tracking spreadsheet to reflect it?

@dims what's the plan here? Can we justify the k/release move still?

@tpepper i think so. otherwise we may have to unwind code (e.g. in kubeadm, as @neolit123 pointed out), which may end up causing more CI failures.

@dims how are things looking here now?

And who's handling docs on this?

@tpepper the main docs would be in kubeadm, stating that folks can drop the -amd64-style suffixes for Kubernetes deliverables. cc @neolit123 @timothysc

@tpepper @dims
on the kubeadm side we removed the arch suffixes for control-plane images, etc.

$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.12.0
k8s.gcr.io/kube-controller-manager:v1.12.0
k8s.gcr.io/kube-scheduler:v1.12.0
k8s.gcr.io/kube-proxy:v1.12.0
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.2.24
k8s.gcr.io/coredns:1.2.2
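The suffix-free tags in the output above are the user-visible payoff. A toy sketch of the naming change (the repo and image names here are illustrative; only `kube-apiserver` is used as an example):

```python
# Sketch: before manifest lists, deployment tooling had to pick an
# arch-suffixed image name itself; with manifest lists, one tag works on
# every platform and the registry resolves the architecture.

def image_name(repo, version, arch=None):
    """arch=None models the manifest-list world: no -ARCH suffix needed."""
    if arch is None:
        return f"{repo}/kube-apiserver:{version}"
    return f"{repo}/kube-apiserver-{arch}:{version}"

print(image_name("k8s.gcr.io", "v1.12.0", "arm64"))  # old style, per-arch
print(image_name("k8s.gcr.io", "v1.12.0"))           # new style, one tag
```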

And who's handling docs on this?

as discussed with you (@tpepper was it on private?) - no docs, no announcements.

  • no docs - this is a transparent feature for users at this point and should "just work" (once we have all the manifest lists in place)
  • no announcements - most CNI plugins are not ready for this

@neolit123 Based on your comment above, I'm moving this out of my attention for docs. Please let me know if I misunderstood you.

@zparnold seems like the right thing to do. it was already discussed as the right way to proceed for this feature. thanks!

@neolit123 Do we need to revisit this, now that the kube-dns images have manifests?

@dims
so all the manifest-lists that kubeadm needs for the control plane and addons are in place, which is very nice! thanks to everyone for the hard work!

i will add an e2e for this later this week. i'm just cleaning it up at this point.

"Kubernetes on multiple platforms" is very broad.
in terms of whether this should be beta or GA eventually, it's hard to judge, because we have kind/ecosystem issues at hand. the only CNI plugin that i know of with good multi-arch support is weave (but it lacks s390x). flannel is on its way to supporting all the arches that we need in 0.11.0, but that's not released yet.

commenting on some of the points in the OP.

We have automated CI e2e Conformance tests running continuously for arm, arm64, ppc64le, s390x and windows. Currently we're lacking hardware support. I'm in touch with CNCF to fix this eventually. Also the e2e suite should be

we don't have these yet. possibly some progress can be made in 1.13.
https://k8s-testgrid.appspot.com/sig-cluster-lifecycle-multi-platform

Manifest lists are pushed so that the same image tag is usable on any platform, without having to substitute -ARCH to get the right Docker image downloaded.

this is done for kubeadm now. again, most CNIs don't have this.

All Kubernetes core and Incubated projects push artifacts for all supported platforms

for main Kubernetes this is true i'd say at this point, but for the incubated projects - not yet?

i don't have enough details on incubated projects, but possibly an org-wide requirement has to happen where all projects support multi-arch. this could drag out the GA status here, so it might be a good idea to omit the incubated part from this feature.

add this instead "a project can graduate from incubation only if it has multi-arch support".

thanks for the analysis @neolit123

for my part i added this to the 1.12 docs https://github.com/kubernetes/website/pull/10379 with tips on how to build these things

Kubernetes 1.13 is going to be a 'stable' release since the cycle is only 10 weeks. We encourage no big alpha features and only consider adding this feature if you have a high level of confidence it will make code slush by 11/09. Are there plans for this enhancement to graduate to alpha/beta/stable within the 1.13 release cycle? If not, can you please remove it from the 1.12 milestone or add it to 1.13?

We are also now encouraging that every new enhancement aligns with a KEP. If a KEP has been created, please link to it in the original post. Please take the opportunity to develop a KEP.

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/multi-platform.md was written before KEPs were a thing, and probably needs to be updated, but that might be a good starting point.

@luxas @mkumatag @ixdy I'm following up from @claurence's post if there any plans for this to graduate in 1.13?

given the shortened 1.13 release cycle, i would say that only some work for missing e2e tests could be done in 1.13, maybe CNI could see improvements, but that's a 3rd party dependency.

RE: incubated projects: IMHO, a separate KEP should be sent for this and we should omit the incubated project demand from this feature as outlined here.

/milestone clear

At this point, this really should not be owned by sig-cluster-lifecycle but should be passed onto @kubernetes/sig-release-feature-requests or the k8s-infra-team.

@timothysc sig-testing may be another option as the main thing lacking right now is automated CI e2e Conformance tests running for arm, arm64, ppc64le, s390x and windows (well, we do have ppc64le, but others are missing)

@dims I don't think it should be on sig-testing's shoulders to manage things like s390x when they don't have the gear to debug it. Ideally, I think the consumers should supply signal via federated testing.

/unassign @timothysc

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@dims - is there a concise description of the hardware and software needs for "federated testing" as described by @timothysc above?

I see this has gone stale again, but I don't think the need has passed for it!

@vielmetti yes! What he meant was that interested parties should set up their own CI using whatever tools they deem fit, BUT publish the results to the community testgrid, so that any consumers, or say the release team, can check whether recent changes have broken functionality on other platforms.

Here's a HOWTO post results to the community test grid:
https://github.com/kubernetes/test-infra/tree/master/testgrid/conformance

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

/lifecycle frozen

/lifecycle frozen

Enhancement issues opened in kubernetes/enhancements should never be marked as frozen.
Enhancement Owners can ensure that enhancements stay fresh by consistently updating their states across release cycles.

/remove-lifecycle frozen

Hello @neolit123 @luxas , I'm the Enhancement Lead for 1.15. Is this feature going to be graduating alpha/beta/stable stages in 1.15? Please let me know so it can be tracked properly and added to the spreadsheet. A KEP will need to be merged before this can be included in 1.15 as well.

Once coding begins, please list all relevant k/k PRs in this issue so they can be tracked properly.

/close

we have separate issues for different platforms already logged. let's close this one out

@dims: Closing this issue.

In response to this:

/close

we have separate issues for different platforms already logged. let's close this one out

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dims thanks for taking the action.

we have separate issues for different platforms already logged. let's close this one out

+1

