Argo: Cannot specify key for input artifact (without full artifact location)

Created on 25 Jun 2020 · 14 Comments · Source: argoproj/argo

Tested on 2.9.0-rc3

The official example works:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifactory-repository-ref-
spec:
  entrypoint: main
  artifactRepositoryRef:
    key: minio
  templates:
    - name: main
      container:
        image: docker/whalesay:latest
        command: [sh, -c]
        args: ["cowsay hello world | tee /tmp/hello_world.txt"]
      outputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt
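
For context, the `minio` key refers to an entry in an `artifact-repositories` ConfigMap in the workflow's namespace. A minimal sketch of such a ConfigMap follows; the bucket name, endpoint, and secret names are assumptions for illustration, not values taken from this issue:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Argo resolves artifactRepositoryRef keys against this ConfigMap by default
  name: artifact-repositories
data:
  minio: |
    s3:
      bucket: my-bucket            # assumed bucket name
      endpoint: minio:9000         # assumed in-cluster MinIO endpoint
      insecure: true
      accessKeySecret:
        name: my-minio-cred        # assumed secret name
        key: accesskey
      secretKeySecret:
        name: my-minio-cred
        key: secretkey
```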

When switching to input it fails:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifactory-repository-ref-
spec:
  entrypoint: main
  artifactRepositoryRef:
    key: minio
  templates:
    - name: main
      container:
        image: docker/whalesay:latest
        command: [sh, -c]
        args: ["cowsay hello world | tee /tmp/hello_world.txt"]
      inputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt

with the error:

Failed to submit workflow: templates.entrypoint.steps[0].main templates.main-template inputs.artifacts.hello_world was not supplied

Another important point, IMO: I don't see any way to specify the key (the location of the object within the S3 bucket) during workflow creation. My understanding is that artifactRepositoryRef can be used to set up default repositories, which can then be reused within workflows by specifying the location of the folder or file we want to use as inputs or outputs. Was it designed for that purpose?

Labels: enhancement, epic/artifacts, workaround

Most helpful comment

Do 👍 to show interest.

All 14 comments

Can I confirm if this used to work and does not work anymore? Or if it just never seemed to work?

It never worked for me. See also https://github.com/argoproj/argo/issues/2461#issuecomment-643462725

I have tested it using both gcs and s3.

I think you must specify the key within the bucket, as well as the bucket, endpoint, etc. Not great, I agree, but that is how it is today.
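
In other words, the workaround today is to carry the full repository location inline on the input artifact. A sketch of what that looks like for the example above (the endpoint, bucket, and secret names are assumptions):

```yaml
      inputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt
            s3:
              endpoint: minio:9000       # assumed in-cluster endpoint
              insecure: true
              bucket: my-bucket          # assumed bucket name
              key: hello_world.txt       # object key within the bucket
              accessKeySecret:
                name: my-minio-cred      # assumed secret name
                key: accesskey
              secretKeySecret:
                name: my-minio-cred
                key: secretkey
```

This duplicates everything that artifactRepositoryRef was supposed to centralize, which is exactly the complaint in the next comment.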

So what's the point of artifactRepositoryRef if you must replicate all the config?

@hadim I'm going to recategorize this as an "enhancement". We should do more work in this area, and I'd like to assess interest.

Hi, I was looking for the same solution - I need many different artifacts as inputs and was hoping to define the S3 parameters only once.

Do 👍 to show interest.

It was really surprising to me to find out that you cannot specify input artifacts from the default artifact repository. It should be supported by default. I would expect to be able to specify only the key in the default bucket, and then it should work out of the box.

This issue isn't really to do with artifactRepositoryRef, it's not supported at all. I'm going to rename this issue to reflect this.

Kubeflow Pipelines (KFP) tries to separate cluster config (like the artifact repository) from workflow config. If Argo supported this feature, we could let each user manage their artifact repository in their own namespace while keeping the workflow definitions shareable.

This is one of the areas KFP hasn't been able to achieve multi-tenancy separation: https://github.com/kubeflow/pipelines/issues/1223#issuecomment-656507073.

Some recent discussion: https://kubeflow.slack.com/archives/CE10KS9M4/p1602516358147900

> It was really surprising to me to find out that you cannot specify input artifacts from the default artifact repository.

Please help me understand: how is it supposed to choose the input artifact from the repository (out of thousands of artifacts already there) based on this information alone?

      inputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt

This does not seem to contain any information that can be used to get an artifact.

One solution that could address this feature request is to add support for a generic uri field in the artifact. The rest of the artifact repository information can be selected based on the artifact URI scheme.

      inputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt
            uri: s3://my-bucket/my_key

This would also be a step towards making it possible to pass the artifact URIs using placeholders like {{tasks.some-task.outputs.hello_world.uri}}.
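
Sketched in a bit more detail, the proposal might look like the following in a workflow that consumes another task's output. To be clear, this is the commenter's proposed syntax, not something Argo supports today, and the task name is a placeholder:

```yaml
      inputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt
            # proposed: the s3:// scheme would select the matching
            # repository config; the placeholder would resolve to the
            # producing task's artifact URI
            uri: "{{tasks.some-task.outputs.hello_world.uri}}"
```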

> generic uri

How would you support secrets for username + password?
