If we have an output artifact from a mounted PVC or some other Volume mount, it shouldn't be deleted after upload.
The code at https://github.com/argoproj/argo/blob/master/workflow/executor/executor.go#L343-L348 assumes that the artifact is on the container's ephemeral storage, but that might not be the case.
We use a volume to share data quickly between workflow steps. However, we want to upload some of the outputs produced in intermediate steps as output artifacts right away, so they can also be used elsewhere. We see that Argo removes the files of those output artifacts, so we can't use them in the next steps... which goes against the recommended advice here: https://github.com/argoproj/argo/blob/master/docs/cost-optimisation.md#consider-trying-volume-claim-templates-or-volumes-instead-of-artifacts
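To illustrate the kind of guard we have in mind, here is a minimal Go sketch. The volumeMounts list, onVolumeMount, and saveArtifact here are hypothetical stand-ins for illustration only, not the actual executor API:

package main

import (
	"fmt"
	"os"
	"strings"
)

// In the real executor the mount paths would come from the pod spec;
// this hard-coded list is just for the sketch.
var volumeMounts = []string{"/mnt/vol"}

// onVolumeMount reports whether path lives on a mounted volume
// rather than on the container's ephemeral storage.
func onVolumeMount(path string) bool {
	for _, m := range volumeMounts {
		if path == m || strings.HasPrefix(path, m+string(os.PathSeparator)) {
			return true
		}
	}
	return false
}

func saveArtifact(localPath string) error {
	// ... tar localPath and upload it to the artifact repository ...

	// Current behavior: always delete after upload.
	// Proposed guard: keep files that later steps may still need to read.
	if onVolumeMount(localPath) {
		fmt.Printf("not deleting local artifact %s: it is on a volume mount\n", localPath)
		return nil
	}
	return os.Remove(localPath)
}

func main() {
	if err := saveArtifact("/mnt/vol/hello_world.txt"); err != nil {
		fmt.Println(err)
	}
}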
What Kubernetes provider are you using? 1.18
What version of Argo Workflows are you running? 2.9.3
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.
Interesting. Could be pretty nasty. We'll discuss.
Perhaps there is no need to delete the artifact files after uploading them.
It's supposed to be an optimization, but I'm not sure it provides benefit.
I also remember a bug I fixed a long time ago in cases where the same file was being output as both an artifact and a parameter.
If we have an output artifact from a mounted PVC or some other Volume mount, it shouldn't be deleted after upload.
This scenario seems pretty strange. If the output data is generated by the component, then it's usually located on the local disk, not the mounted volume.
However, we want to upload some of the outputs produced in intermediate steps as output artifacts right away, so they can also be used elsewhere.
Perhaps it might be better to make uploading explicit and add an upload template. This can be beneficial as your repositories for intermediate data and persistent data might be different.
This scenario seems pretty strange. If the output data is generated by the component, then it's usually located on the local disk, not the mounted volume.
It's actually quite common, and it's the recommended way to pass data between steps without an artifact upload and then a download. We are talking about "heavy" artifacts, multiple gigabytes in size.
Perhaps it might be better to make uploading explicit and add an upload template. This can be beneficial as your repositories for intermediate data and persistent data might be different.
As an optimization we typically do that, so that the upload can happen concurrently with other workflow steps (otherwise the step that creates the output data doesn't end until the artifact upload is complete, preventing the dependent steps from starting). However, this upload template is just a dummy busybox ls or the like, purely to produce an output artifact. We could use a container with the AWS CLI and aws s3 cp... but that seems a bit wrong when S3 artifact support is a core Argo feature (not to mention that our teams of Argo users will have a hard time understanding why it's necessary).
@antoniomo. I've created a dev build for you to test. Can you please test argoproj/argoexec:fix-4676?
Thanks, I can give it a go over the weekend!
Hi!
I used this modified example workflow for testing:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: volumes-existing-
spec:
  entrypoint: volumes-existing-example
  volumes:
  # Pass my-existing-volume as an argument to the volumes-existing-example template
  # Same syntax as k8s Pod spec
  - name: workdir
    persistentVolumeClaim:
      claimName: my-existing-volume

  templates:
  - name: volumes-existing-example
    steps:
    - - name: generate
        template: whalesay
    - - name: print
        template: print-message

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol
    outputs:
      artifacts:
      - name: hello
        path: /mnt/vol/hello_world.txt

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      volumeMounts:
      - name: workdir
        mountPath: /mnt/vol
It completes just fine, including the expected output on the second step, which reads from the volume after the output artifact upload (to the default artifact storage here).
The logs of the wait container on the first step correctly show:
time="2020-12-12T16:01:18.798Z" level=info msg="Saving output artifacts"
time="2020-12-12T16:01:18.798Z" level=info msg="Staging artifact: hello"
time="2020-12-12T16:01:18.798Z" level=info msg="Staging /mnt/vol/hello_world.txt from mirrored volume mount /mainctrfs/mnt/vol/hello_world.txt"
time="2020-12-12T16:01:18.798Z" level=info msg="Taring /mainctrfs/mnt/vol/hello_world.txt"
time="2020-12-12T16:01:18.799Z" level=info msg="Successfully staged /mnt/vol/hello_world.txt from mirrored volume mount /mainctrfs/mnt/vol/hello_world.txt"
time="2020-12-12T16:01:18.799Z" level=info msg="S3 Save path: /tmp/argo/outputs/artifacts/hello.tgz, key: volumes-existing-kt9jx/volumes-existing-kt9jx-3077291320/hello.tgz"
time="2020-12-12T16:01:18.799Z" level=info msg="Creating minio client minio:9000 using static credentials"
time="2020-12-12T16:01:18.799Z" level=info msg="Saving from /tmp/argo/outputs/artifacts/hello.tgz to s3 (endpoint: minio:9000, bucket: my-bucket, key: volumes-existing-kt9jx/volumes-existing-kt9jx-3077291320/hello.tgz)"
time="2020-12-12T16:01:18.804Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/artifacts/hello.tgz
time="2020-12-12T16:01:18.804Z" level=info msg="Successfully saved file: /tmp/argo/outputs/artifacts/hello.tgz"
So I think it works as expected and is good to go :) Thank you!
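For anyone curious about the "mirrored volume mount" lines in those logs, here is a rough Go sketch of the path handling they suggest. The helper names mirroredPath and stagingPath are made up for illustration; the real executor code is more involved:

package main

import (
	"fmt"
	"path/filepath"
)

// The wait container sees the main container's filesystem mirrored
// under /mainctrfs, so /mnt/vol/hello_world.txt becomes
// /mainctrfs/mnt/vol/hello_world.txt, as the logs show.
func mirroredPath(mainCtrPath string) string {
	return filepath.Join("/mainctrfs", mainCtrPath)
}

// stagingPath is where the tarred artifact is written before upload,
// per the "S3 Save path" log line.
func stagingPath(artifactName string) string {
	return filepath.Join("/tmp/argo/outputs/artifacts", artifactName+".tgz")
}

func main() {
	src := "/mnt/vol/hello_world.txt"
	fmt.Printf("Staging %s from mirrored volume mount %s\n", src, mirroredPath(src))
	fmt.Printf("S3 Save path: %s\n", stagingPath("hello"))
}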