Test-infra: Move kubeflow jobs to https://github.com/GoogleCloudPlatform/oss-test-infra/tree/master/prow/prowjobs

Created on 16 Sep 2019 · 23 Comments · Source: kubernetes/test-infra

@jlewi what do you think about moving the kubeflow jobs over to https://github.com/GoogleCloudPlatform/oss-test-infra/tree/master/prow/prowjobs and prow.gflocks.com?

kind/cleanup lifecycle/frozen

All 23 comments

@fejta That seems fine with me.

What else would need to change? i.e.,

Do testgrids move?
Do we use a different GCS bucket for prow artifacts?

@scottilee is this something you could help with?

@fejta How urgent is this on your end?

Not urgent.

Testgrid can stay the same.
Are you still not using pod utils? If so then yes, you'll upload to a different bucket (maybe prow specifies where to upload it?)

@chases2 we should probably set up GCP/oss-test-infra to work like istio -- where we can annotate prowjobs there and have them show up in this testgrid.
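For readers unfamiliar with annotated prowjobs, here is a minimal sketch of what such a job might look like. The `testgrid-dashboards` and `testgrid-tab-name` annotation keys are the ones test-infra documents for annotated jobs; the job name, dashboard name, tab name, image, and command are hypothetical placeholders, not actual Kubeflow config.

```yaml
periodics:
- name: kubeflow-example-periodic            # hypothetical job name
  interval: 8h
  annotations:
    # The dashboard must already exist in the testgrid config.
    testgrid-dashboards: example-kubeflow-dashboard   # hypothetical dashboard
    testgrid-tab-name: example-periodic               # hypothetical tab name
  spec:
    containers:
    - image: gcr.io/example-project/test-image:latest # hypothetical image
      command:
      - ./run-tests.sh                                # hypothetical script
```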

@fejta correct we manually upload our files to GCS right now; but we could probably switch to use pod_utils.

@fejta Can you share some more info (e.g., a link to a ticket or document with an explanation, if available) on the reason for the move from prow.k8s.io to prow.gflocks.com?

Also, would it just be creating a "kubeflow" folder in https://github.com/GoogleCloudPlatform/oss-test-infra/tree/master/prow/prowjobs and moving the files in https://github.com/kubernetes/test-infra/tree/master/config/jobs/kubeflow to there?

Lastly, I'm not familiar with pod-utils. I'm assuming it's this https://github.com/kubernetes/test-infra/tree/master/prow/pod-utils? Any more info anywhere so I can read up on it?

why the move

prow.k8s.io is for kubernetes (or at least CNCF project)
prow.gflocks.com is for public google projects.

Eventually we want to migrate all non-CNCF projects out of prow.k8s.io

And yes, it is ideally
a) creating a GKE cluster to run jobs (provides you with isolation from other jobs)
b) configuring prow to schedule kubeflow jobs into that cluster
b1) also moving any secrets, configmaps, etc that jobs use
c) moving jobs to that prow instance
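As a rough illustration of steps b) and b1), a job can name the build cluster it should run in through the `cluster` field, and mount secrets that exist in that cluster. This is only a sketch: the repo, job name, cluster alias, secret name, image, and command below are all hypothetical, and the alias would have to match a build cluster registered with the Prow instance.

```yaml
presubmits:
  kubeflow/example-repo:                     # hypothetical org/repo
  - name: kubeflow-example-presubmit         # hypothetical job name
    cluster: kubeflow-build-cluster          # hypothetical build cluster alias
    spec:
      containers:
      - image: gcr.io/example-project/test-image:latest   # hypothetical image
        command:
        - ./run-tests.sh                     # hypothetical script
        volumeMounts:
        - name: example-credentials
          mountPath: /secrets
          readOnly: true
      volumes:
      - name: example-credentials
        secret:
          # Hypothetical secret; it must be created in the build
          # cluster namespace where the job pods run.
          secretName: example-service-account
```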

pod-utils

Let's not worry about this for now, see https://github.com/kubernetes/test-infra/blob/master/prow/jobs.md#pod-utilities for more detail.

Test containers should no longer need to check out repos and/or upload results to GCS. Sidecar containers will do this for you.
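A minimal sketch of a job that opts into the pod utilities, assuming a hypothetical repo, job name, and image. Setting `decorate: true` asks Prow to add the clone, entrypoint, and sidecar containers around the test container.

```yaml
presubmits:
  kubeflow/example-repo:                     # hypothetical org/repo
  - name: kubeflow-example-decorated         # hypothetical job name
    decorate: true                           # enable the pod utilities
    spec:
      containers:
      - image: gcr.io/example-project/test-image:latest   # hypothetical image
        # Decorated jobs need an explicit command, because Prow wraps it
        # with its own entrypoint.
        command:
        - make
        args:
        - test
```

With decoration enabled, the repo under test is checked out before the test container starts, and files the test writes under the directory named by `$ARTIFACTS` are uploaded along with the logs.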

@fejta I apologize for the delay on this. I started a PR at https://github.com/GoogleCloudPlatform/oss-test-infra/pull/93, which is probably wrong 🙄 but it's a start! Let me know what's missing...

  1. In your last comment above you mention "creating a GKE cluster", is that a GKE cluster that the Kubeflow project should create? If so, where do we specify the details for Prow (I don't think I saw any examples in the existing folders in oss-test-infra/prow/prowjobs)?
  2. Once the PR to oss-test-infra is merged I should open a PR to delete the kubeflow folder and the associated YAML at https://github.com/kubernetes/test-infra/tree/master/config/jobs, correct?

@scottilee Kubeflow already has a Kubernetes cluster in project kubeflow-ci which we use for testing purposes. So I believe with the new approach the goal would be to have prow schedule the jobs for Kubeflow on that instance. I'm not sure what we need to do to make that happen. I suspect we need to install some CRs and other infra on our test cluster.

Given that we are getting close to 0.7, we might want to be careful not to make any infra changes that could inhibit us from releasing on time.

@jlewi can I either get access to the kubeflow-ci project or can you create the test-pods namespace and generate the cluster values according to the directions here https://github.com/GoogleCloudPlatform/oss-test-infra/pull/93#issuecomment-535367394.

If you need access to the CI cluster please join this group.
https://github.com/kubeflow/internal-acls/blob/master/ci-team.members.txt

Let's proceed cautiously in terms of moving our prow infrastructure because we are getting ready to do a release and don't want to disrupt our test infra.

@scottilee opened up kubeflow/testing#475 to track changes to our test infra. I will run mkbuild-cluster as soon as I can.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

kubernetes/test-infra#16898 moved the prow jobs onto the kubeflow testing cluster, but we are still using the kubernetes instance of prow. So I think this issue should remain open.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/lifecycle frozen

I believe the first part was merged:
kubernetes/test-infra#16898
Here's the doc @clarketm put together: https://docs.google.com/document/d/17sA-rRBe30bM034nL353CgrXETfy2vMe2m_g_bB_ILY/edit#

I believe per the doc Kubeflow is now using its own build cluster; i.e. the prow jobs run inside a kubeflow owned cluster.

So I believe the next part of the migration is to move from the CNCF/kubernetes prow control plane to the GCP/kubernetes control plane

/assign @Bobgy
I'll try to push this forward moving to GCP/oss-test-infra, so that Kubeflow maintainers can be added as approvers.

I have coordinated with gcp oss prow team and will start the migration this week.
/cc @chaodaiG @jlewi @chensun

I'll put progress log here.

Add @google-oss-robot as kubeflow org admin: https://github.com/kubeflow/internal-acls/pull/418

UPDATE: there's a permission issue on gcp oss prow side.
We are currently blocked by resolving that first.
