Pipelines: Are there plans to allow kubeflow pipelines to be created through a manifest file?

Created on 8 Feb 2019 · 9 Comments · Source: kubeflow/pipelines

Are there any plans to make Kubeflow Pipelines a CRD and let users create pipelines through YAML manifests, as Argo does?

priority/p2

All 9 comments

I think this is a terrific idea - basically, running some form of a CRD "compiler" in the Kubeflow cluster, so you don't have to do a build step locally.

/cc @vicaire @jlewi @paveldournov

Creating pipelines and pipeline components through a YAML manifest is indeed on the roadmap. Please stay tuned.

/assign @paveldournov

Something like this would be great: https://github.com/kubeflow/pipelines/blob/master/backend/test/cli_test.go#L62

Are there any docs on this?

@swiftdiaries, the CLI is incomplete because of an API revamp that we did just before the OSS release. We plan to address this soon. If this is something that you or someone in the community would like to take on, that would be a very useful contribution.

The parts of the CLI that have integration tests (https://github.com/kubeflow/pipelines/blob/master/backend/test/cli_test.go) are complete and tested at each PR commit.

For now, the CLI lets the user specify the Argo YAML of the backing Argo workflow. Soon, we plan to create a new CRD for the "pipeline resource". This "pipeline resource" will encapsulate:

  • The backing orchestration primitive (e.g., Argo for task-driven workflows).
  • Metadata (e.g., the data types of each step's inputs and outputs, a description of each step, etc.).
  • Provenance data: unlike most Kubernetes resources, this "pipeline resource" gets persisted to a permanent storage DB (MySQL in the current implementation) to record provenance and similar data (see https://bit.ly/2WhNT3D).
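
To make the list above more concrete, here is a minimal Go sketch of how such a "pipeline resource" might be modeled. Every type and field name (PipelineResource, Orchestration, StepMetadata, Validate, and so on) is hypothetical and invented for illustration; no schema for the planned CRD is given in this thread. The JSON output only stands in for whatever form would eventually be persisted.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Orchestration holds the backing orchestration primitive.
// Hypothetical shape: the field names are assumptions, not a published schema.
type Orchestration struct {
	Engine string `json:"engine"` // e.g. "argo"
	Spec   string `json:"spec"`   // raw workflow YAML, treated as opaque text here
}

// StepMetadata describes one step of the workflow (hypothetical shape).
type StepMetadata struct {
	Name        string            `json:"name"`
	Description string            `json:"description,omitempty"`
	InputTypes  map[string]string `json:"inputTypes,omitempty"`
	OutputTypes map[string]string `json:"outputTypes,omitempty"`
}

// PipelineResource bundles the orchestration primitive with step metadata.
type PipelineResource struct {
	Name          string         `json:"name"`
	Orchestration Orchestration  `json:"orchestration"`
	Steps         []StepMetadata `json:"steps"`
}

// Validate performs basic sanity checks before the resource is stored.
func (p PipelineResource) Validate() error {
	if p.Name == "" {
		return errors.New("pipeline name is required")
	}
	if p.Orchestration.Engine == "" {
		return errors.New("orchestration engine is required")
	}
	seen := make(map[string]bool)
	for _, s := range p.Steps {
		if s.Name == "" {
			return errors.New("step name is required")
		}
		if seen[s.Name] {
			return fmt.Errorf("duplicate step name %q", s.Name)
		}
		seen[s.Name] = true
	}
	return nil
}

func main() {
	p := PipelineResource{
		Name: "demo-pipeline",
		Orchestration: Orchestration{
			Engine: "argo",
			// Truncated placeholder, not a complete Argo workflow.
			Spec: "apiVersion: argoproj.io/v1alpha1\nkind: Workflow\n",
		},
		Steps: []StepMetadata{
			{
				Name:        "train",
				Description: "Trains the model",
				InputTypes:  map[string]string{"dataset": "csv"},
				OutputTypes: map[string]string{"model": "saved-model"},
			},
		},
	}
	if err := p.Validate(); err != nil {
		fmt.Println("invalid pipeline resource:", err)
		return
	}
	out, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Keeping the workflow spec as opaque text mirrors the current CLI behavior described above, where the user supplies the Argo YAML directly; a real CRD would more likely embed a typed spec.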

To use the CLI in its current state, you can follow these instructions:

Run the following command to install the CLI:

$ go get -u github.com/kubeflow/pipelines/backend/src/cmd/ml

Make sure that $GOPATH/bin is in your PATH environment variable:

$ export PATH=$GOPATH/bin:$PATH

You then just need to configure kubeconfig using kubectl to set up the cluster that the CLI should interact with. Example commands:

$ kubectl config current-context
$ kubectl config get-clusters
$ MY_CLUSTER=
$ kubectl config set-cluster $MY_CLUSTER

From there, the CLI is self-documenting. For instance:

$ ml help pipeline
Manage pipelines

Usage:
  ml pipeline [command]

Available Commands:
  create       Create a pipeline
  delete       Delete a pipeline
  get          Display a pipeline
  get-manifest Display the manifest of a pipeline
  list         List all pipelines
  upload       Upload a pipeline

Thank you so much for the reply, @vicaire :) I'll open an issue for the CLI. Happy to contribute.

This is great, thank you.

Closing as a duplicate of https://github.com/kubeflow/pipelines/issues/812

Please reopen if it is not the case.
