Test-infra: Multi arch for test-infra/images

Created on 3 Mar 2020 · 15 comments · Source: kubernetes/test-infra

This issue is for publishing multi-architecture images for test-infra/images, so that the relevant jobs can be scheduled by Prow onto Arm64 nodes.

  • [ ] kubekins-test
  • [ ] builder, google/cloud-sdk

If a typical k8s-e2e-kind-test Prow job were scheduled into an Arm64 k8s cluster, its pod would contain at least the following images, so we should make them support Arm64:

  • [ ] kubekins-e2e
  • [ ] bootstrap
  • [ ] sidecar
  • [ ] initupload
  • [ ] clonerefs
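Whether a published image already supports Arm64 can be read off its manifest list. A minimal sketch; the image reference and tag below are illustrative, not confirmed values from this issue:

```shell
# Inspect an image's manifest and print the architectures it covers.
# Substitute the real kubekins/Prow utility image reference here.
docker manifest inspect gcr.io/k8s-testimages/kubekins-e2e:latest-master \
  | grep '"architecture"'
```

A single-arch image reports only amd64; a multi-arch image (manifest list) also lists arm64, ppc64le, etc.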
kind/feature lifecycle/rotten

All 15 comments

cc @BenTheElder @zhlhahaha

This seems quite a bit ahead of ourselves without any plan for Prow on such nodes.

cc @dims @mkumatag

Interesting, I was thinking about this for quite some time. Can we have clusters for individual architectures running anywhere in the world, with Prow scheduling each architecture's jobs into those clusters, and get things done?

I don't see why we would go to all that trouble instead of having jobs SSH (or similar) to run test clusters on whatever arch.

We barely have people helping maintain the existing infra with relatively self-managing GKE clusters. I can't see us wanting to maintain long-running bespoke clusters on various infra, on top of porting and cross-building all of these equally barely maintained CI images...


This issue is used to track it.
We will test it in our local hybrid CI cluster first.
Then we will update this issue and decide whether to do this.
Thanks.

Hi @BenTheElder @mkumatag @dims, here are some updates, issues, and plans for Prow testing on ARM. Please feel free to tell me if anything is unclear. Looking forward to your suggestions.

Update:

  • We deployed a Prow cluster on x86 and a Kubernetes cluster on ARM.
  • Enabled Prow to assign test jobs to the ARM k8s cluster.
  • Set up a presubmit job, "pull-kubernetes-conformance-image-test", on the ARM k8s cluster; it runs successfully, all kind conformance tests pass, and the results have been uploaded to GCS.

Issues:

  • The kubekins-e2e image for the ARM platform is built by ourselves.
  • In the Prow jobs we still use bootstrap.py, because we cannot use the Pod utilities: the bootstrap, sidecar, initupload, and clonerefs images only support the x86 platform.
  • In order to trigger the presubmit jobs, I use my private kubernetes repository on github.com.
  • The script "kind-conformance-image-e2e.sh" only supports the x86 platform. I submitted a patch for multi-arch support, but it is not enough: the script pins kind 0.6.0, and that binary does not work well on the ARM platform, so we use a kind binary compiled from the most recent source code.

Plans:

  • enable multi-arch builds of the kubekins-e2e image
  • enable multi-arch builds of bootstrap, sidecar, initupload, and clonerefs
  • wait for a new release of kind
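The build plan above could, for example, use docker buildx with QEMU binfmt handlers to produce a manifest list in one step. This is a hypothetical sketch, not the repo's actual build tooling; the builder name, image tag, and build context path are made up:

```shell
# Register QEMU binfmt handlers so arm64 build stages can run under emulation.
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

# Create a buildx builder and build+push amd64 and arm64 variants together;
# --push publishes a single multi-arch manifest list under the given tag.
docker buildx create --use --name multiarch-builder
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag gcr.io/k8s-testimages/kubekins-e2e:multiarch-test \
  --push \
  images/kubekins-e2e
```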

I still don't think we need Prow on ARM at all; we can let a Prow job on the normal build cluster SSH (or similar) to ARM infra, the same as we do when we bring up clusters on AWS / GCP.

Hi @BenTheElder
According to our research, the Prow control plane can stay on x86: we can create an arm64 build cluster and let some typical test Prow jobs, such as pull-kubernetes-conformance-image-test, run on Arm64.
So, in my opinion, we should make the images those Prow jobs rely on support Arm64.
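For context, pointing a job at a separate arm64 build cluster would look roughly like the following ProwJob config sketch; the cluster name, node selector value, and image tag are assumptions, not values from this thread:

```yaml
presubmits:
  kubernetes/kubernetes:
  - name: pull-kubernetes-conformance-image-test-arm64
    # "arm64-cluster" must match a build-cluster context registered with Prow.
    cluster: arm64-cluster
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64
      containers:
      # Hypothetical arm64 variant of the kubekins-e2e image.
      - image: gcr.io/k8s-testimages/kubekins-e2e:latest-master-arm64
```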

The following figure shows Prow's microservice architecture for the 'pull-kubernetes-conformance-image-test' job on Arm64.
(figure omitted)

cc @jingzhao123

Hi @BenTheElder @spiffxp @mkumatag
Can we enable prow utility images for multi-arch?
Here are some reasons why we need prow utility images running on multi-arch.

  1. The k8s conformance tests running on ARM (see https://testgrid.k8s.io/sig-node-arm64) are currently triggered by a shell script, and we want to align the test flow with kubernetes/test-infra by using Prow.
  2. Some other projects, like KubeVirt and Kubeflow, all use Prow for their CI lanes; we are trying to enable ARM CI testing for those projects.

I still don't think we need Prow on ARM at all; we can let a Prow job on the normal build cluster SSH (or similar) to ARM infra, the same as we do when we bring up clusters on AWS / GCP.

This can definitely work. However, consider how much simpler it is to onboard different architectures for open-source projects if you can just pass a new cluster context to Prow and put a label on your job.

As far as I can see, it would mostly just be necessary to cross-compile a few plain Go binaries. clonerefs may need a multi-arch base image with git (building that with qemu-user-static should be straightforward; we do that in KubeVirt too, on our amd64 machines). The bootstrap image is, in my opinion, not really needed. It would be about

initupload
clonerefs

only.

@BenTheElder, would you see a way where we could maintain this in this repo? We basically have the same issue with ppc64 too, by the way.
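Since the Pod utilities are static Go binaries, the cross-compile itself is the easy part. A minimal sketch, assuming the standard GOARCH names; the helper function below is hypothetical, mapping `uname -m` output to the architecture names used above:

```shell
# Map `uname -m` output to the Go/Docker architecture names (amd64, arm64,
# ppc64le) that cross-builds and --platform flags expect.
arch_to_goarch() {
  case "$1" in
    x86_64)  echo amd64 ;;
    aarch64) echo arm64 ;;
    ppc64le) echo ppc64le ;;
    *) echo "unsupported arch: $1" >&2; return 1 ;;
  esac
}

# Cross-compiling e.g. clonerefs would then be (path is illustrative):
#   GOOS=linux GOARCH="$(arch_to_goarch aarch64)" CGO_ENABLED=0 \
#     go build -o clonerefs-arm64 ./prow/cmd/clonerefs
arch_to_goarch "$(uname -m)"
```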

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
