Argo: Container sequences

Created on 31 Mar 2020  ·  9 Comments  ·  Source: argoproj/argo

Summary

It should be possible to run multiple steps within the same pod using ephemeral containers.

Motivation

  • Avoids the need to pass artifacts around.

Proposal

TODO



Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

enhancement

All 9 comments

Would that also cover "Avoids the need to pass output parameters around"?

Yes! That's the main advantage

The main advantage of this feature would be to avoid passing artifacts using an external provider between different tasks in a Workflow, when the intermediary artifacts can be discarded after use.

To achieve this, we would make use of ephemeral containers in K8s. The idea is that the controller would create and remove ephemeral containers in a single pod, allowing them all to use the same filesystem.

I envision something like a steps template:

- name: sequence
  sequence:
    - - name: create-artifact
        template: gen-data
    - - name: consume-artifact
        template: process-data

- name: gen-data
  container:
    ...
  outputs:
    artifacts:
      file: ...

- name: process-data
  inputs:
    artifacts:
      file: ...
  container:
    ...

Ideally, users would simply be able to rename steps to sequence in order to leverage this feature. The controller would only need the existing inputs/outputs already found in templates to achieve this.
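To make the "existing inputs/outputs are enough" point concrete, here is a minimal Python sketch of how a controller might turn a sequence into a run order for containers sharing one filesystem. Everything in it is hypothetical: `plan_sequence`, the dictionary layout, and the artifact name `file` are assumptions made for illustration and are not taken from Argo's code.

```python
from typing import Dict, List

# Hypothetical data mirroring the YAML above. The artifact name "file" and the
# list-of-dicts layout are assumptions made for this sketch.
TEMPLATES: Dict[str, dict] = {
    "gen-data": {"outputs": {"artifacts": [{"name": "file"}]}},
    "process-data": {"inputs": {"artifacts": [{"name": "file"}]}},
}
SEQUENCE: List[List[dict]] = [
    [{"name": "create-artifact", "template": "gen-data"}],
    [{"name": "consume-artifact", "template": "process-data"}],
]


def plan_sequence(sequence: List[List[dict]], templates: Dict[str, dict]) -> List[str]:
    """Return template names in run order.

    Raises ValueError if a step declares an input artifact that no earlier
    group has written to the shared filesystem.
    """
    available = set()  # artifact names already present in the shared filesystem
    order: List[str] = []
    for group in sequence:
        for step in group:
            tmpl = templates[step["template"]]
            for art in tmpl.get("inputs", {}).get("artifacts", []):
                if art["name"] not in available:
                    raise ValueError(
                        f"step {step['name']!r} needs artifact {art['name']!r} "
                        "before any earlier step has produced it"
                    )
            order.append(step["template"])
        # Outputs only become visible once the whole group has finished.
        for step in group:
            tmpl = templates[step["template"]]
            for art in tmpl.get("outputs", {}).get("artifacts", []):
                available.add(art["name"])
    return order


if __name__ == "__main__":
    print(plan_sequence(SEQUENCE, TEMPLATES))
```

The sketch only checks ordering. It says nothing about how containers would actually be started in the pod, which is the part the PoC would need to answer.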

NOTE: This feature is still only an idea: we're about to start creating a PoC to see just how viable it is. Nothing is set in stone (not even the name sequence) and I expect this to change as we learn more about the limitations/features of this. All feedback is welcome at this time!

Seems like a great idea and very useful. Just a couple thoughts:

  • Ephemeral containers are in alpha and have a lot of downsides (https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-stages)
  • Docs suggest the use-case is generally things like debugging running containers. I wonder if there will be any gotchas with using a lot of them for heavier tasks like data processing.

For Argo on production clusters, this capability might go unused for a while. That said, benefits might outweigh the risks for certain use-cases.

You are very much correct @ddseapy. We are definitely treating this as an experimental feature.

An update on this: given some limitations placed by K8s on this feature (mainly that individual ephemeral containers in a Pod cannot be replaced or modified, and only the _entire_ list of ephemeral containers can be replaced in one operation), we don't think this feature as described is currently feasible.
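The restriction can be pictured with a toy model. The Python below is not Kubernetes code; `validate_ephemeral_update` and the dictionary shape are made up for illustration, and the rule it encodes (an update submits the whole list, and entries already present may not be changed or dropped) is my reading of the limitation described above.

```python
from typing import Dict, List


def validate_ephemeral_update(current: List[Dict], proposed: List[Dict]) -> None:
    """Accept a whole-list replacement only if every existing entry survives unchanged.

    Toy model of the limitation discussed in this thread, not the real API server logic.
    """
    proposed_by_name = {c["name"]: c for c in proposed}
    for existing in current:
        new = proposed_by_name.get(existing["name"])
        if new is None:
            raise ValueError(f"ephemeral container {existing['name']!r} may not be removed")
        if new != existing:
            raise ValueError(f"ephemeral container {existing['name']!r} may not be changed")


if __name__ == "__main__":
    running = [{"name": "gen-data", "image": "example/gen"}]  # hypothetical image names

    # Appending the next step is fine in this model.
    validate_ephemeral_update(
        running, running + [{"name": "process-data", "image": "example/process"}]
    )

    # Dropping the finished step, as a "create and remove" sequence would want, is rejected.
    try:
        validate_ephemeral_update(
            running, [{"name": "process-data", "image": "example/process"}]
        )
    except ValueError as err:
        print(err)
```

Under this model a sequence could only ever grow the list, so finished steps would stay attached to the pod, which is at odds with the "create and remove" design proposed earlier.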

However, I'll investigate if we can take advantage of this feature for other purposes, such as a streamlined "Retry" node that performs its retries on the same Pod, saving the need to create new ones and download artifacts every time.
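As a rough illustration of what an in-place retry could save, the Python sketch below fetches inputs once and then retries the task against the same working directory. The function name, the callback signatures, and the retry limit are all hypothetical; this is not how Argo implements retries.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_in_place_retries(
    fetch_inputs: Callable[[], str],
    task: Callable[[str, int], T],
    limit: int,
) -> T:
    """Fetch inputs once, then retry `task` up to `limit` times in the same workdir."""
    workdir = fetch_inputs()  # artifacts are downloaded a single time
    last_error: Optional[Exception] = None
    for attempt in range(1, limit + 1):
        try:
            return task(workdir, attempt)
        except Exception as err:  # any failure counts as a retryable attempt here
            last_error = err
    raise RuntimeError(f"task failed after {limit} attempts") from last_error
```

With pod-per-retry, the equivalent of `fetch_inputs` would run on every attempt. Here it runs once regardless of how many attempts are needed.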

@simster7 could you please close this issue, since this feature is not possible, and open a new issue for "in-place retries" so that the 👍 count reflects the popularity of that issue?

@simster7 bump!

Closing this as it is currently implausible. Related: https://github.com/argoproj/argo/issues/3475
