Pipelines: [doc] Is there any document of the working Artifact Storage List?

Created on 26 Jun 2020 · 10 Comments · Source: kubeflow/pipelines

Question about artifact Storage List

I searched for a list of supported artifact storage backends but could not find one.
I just want to know which of the following I/O schemes are available, for example for the XGBoost sample:

  • gs://
  • s3://
  • https://
  • http://
  • file://

I have been looking through the following source code, but I could not find any documentation.

Since I am looking through the XGBoost sample code, I want to know which storage is supported.
As far as I know, Argo Workflows' artifact storage is used for the pipelines, but I could not find how it relates to the pipeline parameters.

area/docs  lifecycle/stale  status/triaged

All 10 comments

Hello @sakaia

Which artifact storage do you want to use?

The KFP backend uses Argo for orchestration and data passing. You can configure any artifact storage that Argo supports: https://github.com/argoproj/argo/blob/master/docs/configure-artifact-repository.md
Components and pipelines that use data passing features (outputPath and inputPath) will work automatically.

KFP frontend supports artifact previews for S3 and GCS.

There are some components that do not use the system-provided data passing support, but instead take URIs to the data and read and write the data themselves. For such components, storage support differs by component (for example, components that use tensorflow.io.gfile support gs:// and s3://, but not https://).
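To make the per-component difference concrete, here is a minimal sketch of how a URI-taking component might check a scheme before trying to read. The SUPPORTED_SCHEMES set and both function names are hypothetical and only mirror the tensorflow.io.gfile example above (gs:// and s3:// accepted, https:// rejected, plain local paths treated as file://); a real component's list depends on the I/O library it uses.

```python
from urllib.parse import urlparse

# Hypothetical table for one component; adjust it to whatever the
# component's I/O library can actually open.
SUPPORTED_SCHEMES = {"gs", "s3", "file"}


def uri_scheme(uri: str) -> str:
    """Return the URI scheme, treating scheme-less paths as local files."""
    scheme = urlparse(uri).scheme
    return scheme or "file"


def is_supported(uri: str, supported=SUPPORTED_SCHEMES) -> bool:
    """Check whether this (hypothetical) component can read the given URI."""
    return uri_scheme(uri) in supported


if __name__ == "__main__":
    for uri in ["gs://bucket/train.csv", "https://example.com/train.csv", "/tmp/train.csv"]:
        print(uri, "->", "ok" if is_supported(uri) else "unsupported")
```

A check like this only tells you what the component claims to handle; credentials and network access still have to be set up separately.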

We advise component authors to use the system-provided data-passing capabilities, which work well unless the data is too big. This way, the components are portable and will use the configured artifact repository.
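As a rough illustration of why path-based components are portable, the sketch below shows a component body that only ever sees local file paths. The function name count_rows and its logic are made up for this example. In the KFP SDK the two parameters would typically be annotated with kfp.components.InputPath and kfp.components.OutputPath so that the system moves the data to and from the configured artifact repository; here they are plain strings so the code runs without KFP installed.

```python
import os
import tempfile


def count_rows(input_path: str, output_path: str) -> None:
    """Read a local input file and write the row count to a local output file.

    The component never touches gs:// or s3:// itself; in a pipeline the
    system would supply both paths and handle the storage side.
    """
    with open(input_path) as f:
        n_rows = sum(1 for _ in f)
    # The output directory may not exist yet, so create it if needed.
    os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
    with open(output_path, "w") as f:
        f.write(str(n_rows))


if __name__ == "__main__":
    # Local dry run that plays the role of the orchestrator.
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "data.csv")
    dst = os.path.join(workdir, "out", "rows.txt")
    with open(src, "w") as f:
        f.write("a,b\n1,2\n3,4\n")
    count_rows(src, dst)
    print(open(dst).read())
```

Because the function depends only on local paths, the same code can be tested on a laptop and later run against whichever artifact repository the cluster is configured with.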

P.S. You can check this simpler set of XGBoost components and the sample pipeline

Thank you for your suggestion. I am pleased to see the argo document.

By the way, are there any documents for "artifactory"?
OSS support currently seems to be an open issue.

Here are the OSS examples: https://github.com/argoproj/argo/pull/1919/files

Is there any setting or parameter for accessing gs:// files?
I have not succeeded in executing xgboost_training_cm.py on my local machine, although I can access the files via gsutil.
Judging from the error message, a Google Cloud project ID has to be supplied somewhere.

Annotations:    pipelines.kubeflow.org/component_spec:
                  {"description": "Deletes a DataProc cluster.\n", "inputs": [{"description": "Required. The ID of the Google Cloud Platform project that th...
                workflows.argoproj.io/node-message:
                  Error response from daemon: No such container: 5781b8b610c788cd48943aa8257e3e8d7eaa4ae77dfa698322c995a26712b614
                workflows.argoproj.io/node-name: xgboost-trainer-6bqn6.onExit
                workflows.argoproj.io/template:
                  {"name":"dataproc-delete-cluster","inputs":{"parameters":[{"name":"project","value":"{{kfp-project-id}}"}]},"outputs":{},"metadata":{"anno...
Status:         Failed

@sakaia

Yes, I think you'll need to manually override {{kfp-project-id}} when submitting the pipeline for execution, using the GCP project ID under which you're going to create/delete the GKE cluster.
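One way to catch this before submitting is to scan the run arguments for leftover {{...}} placeholders. The sketch below is only an illustration: the helper name unresolved_placeholders, the argument names, and the value "my-gcp-project" are hypothetical and not part of the KFP SDK.

```python
import re

# Matches template-style placeholders such as {{kfp-project-id}}.
PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def unresolved_placeholders(arguments: dict) -> dict:
    """Map each argument name to the placeholder names still present in its value."""
    unresolved = {}
    for name, value in arguments.items():
        found = PLACEHOLDER.findall(str(value))
        if found:
            unresolved[name] = found
    return unresolved


if __name__ == "__main__":
    # Hypothetical run arguments; "project" was left at its placeholder default.
    args = {"project": "{{kfp-project-id}}", "region": "us-central1"}
    print(unresolved_placeholders(args))

    # After overriding with a (made-up) project ID, nothing is reported.
    args["project"] = "my-gcp-project"
    print(unresolved_placeholders(args))
```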

Thank you for your comments.

I am using Kubeflow Pipelines on a local machine (not on GKE).
I want to access Google Cloud Storage (gs://).

From the CLI and SDK, I can access gs://.
But I cannot access it from a container, as in sample/secret.py. Is there any setting for this at pipeline execution?

@sakaia

If you're running entirely locally, then I don't think the xgboost sample is feasible anyway, because it needs to talk to the GCP Dataproc service, which needs a GCP project ID.

Thank you for your comments. Since creating this issue, I have come to understand that the sample depends heavily on GCP services (such as Dataproc).

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
