This issue is to document a proposal and facilitate discussion on having an alternative model of workflow execution: a bottom-up, DAG dependency-based approach vs. our current model of a top-down, steps-based approach. We would continue to support the steps-based execution model, but provide the option to execute workflows using the DAG-based model.
The idea behind a dependency approach is similar in concept to a Makefile: you have a target, that target has dependencies, which in turn have their own dependencies. The execution engine figures out a resolution order to reach the desired target, and runs the nodes as optimally as possible (essentially, each node runs as soon as all of its dependencies have been satisfied).
Consider the following run-of-the-mill diamond workflow pattern which fans out and fans back in:
```
    ONE
   /   \
 TWO   THREE
   \   /
    FOUR
```
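The resolution order described above can be sketched with a small topological sort. The following is a minimal Python sketch (my own illustration, not Argo's actual scheduler; all names are hypothetical) that groups targets into execution "waves", where each wave contains every target whose dependencies have all completed:

```python
from collections import deque

def execution_order(deps):
    """Group targets into waves: a target runs as soon as all
    of its dependencies have completed.
    `deps` maps each target to the list of targets it depends on."""
    indegree = {t: len(d) for t, d in deps.items()}
    dependents = {t: [] for t in deps}
    for t, d in deps.items():
        for parent in d:
            dependents[parent].append(t)
    # Targets with no dependencies are runnable immediately
    ready = deque(t for t, n in indegree.items() if n == 0)
    waves = []
    while ready:
        wave = sorted(ready)
        ready.clear()
        waves.append(wave)
        for t in wave:
            for child in dependents[t]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return waves

# The diamond workflow above:
diamond = {"one": [], "two": ["one"], "three": ["one"], "four": ["two", "three"]}
print(execution_order(diamond))
# → [['one'], ['three', 'two'], ['four']]
```

Note that `two` and `three` land in the same wave, which is exactly where the parallelism of the DAG model comes from.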
The following is how it might be represented in our workflow yaml by declaring dependencies between targets (aka nodes):
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: diamond-dag-
spec:
  target: four
  templates:
  - name: whalesay
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
  targets:
  - name: one
    template: whalesay
    arguments:
      parameters: [{name: message, value: one}]
  - name: two
    dependencies:
    - one
    template: whalesay
    arguments:
      parameters: [{name: message, value: two}]
  - name: three
    dependencies:
    - one
    template: whalesay
    arguments:
      parameters: [{name: message, value: three}]
  - name: four
    dependencies:
    - two
    - three
    template: whalesay
    arguments:
      parameters: [{name: message, value: four}]
```
This is how we are able to achieve the same functionality today:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: diamond-steps-
spec:
  entrypoint: diamond
  templates:
  - name: whalesay
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
  - name: diamond
    steps:
    - - name: one
        template: whalesay
        arguments:
          parameters: [{name: message, value: one}]
    - - name: two
        template: whalesay
        arguments:
          parameters: [{name: message, value: two}]
      - name: three
        template: whalesay
        arguments:
          parameters: [{name: message, value: three}]
    - - name: four
        template: whalesay
        arguments:
          parameters: [{name: message, value: four}]
```
Below are some of the key benefits of the DAG model:
Highly optimized target execution. Execution of the nodes in the DAG can be highly optimized because each node runs as soon as its dependencies have been satisfied.
Flexibility in control flow. With today's steps-based model, we can approximate a DAG, but never truly achieve the same level of functionality.
However, there are some key drawbacks to DAG workflows:
Difficult mental model. The bottom-up approach of building dependencies can be a somewhat unnatural way to think about some workflows. The top-down approach feels more natural for workflows that are closer to a script than a DAG. It may also be harder to understand and express branching in a DAG.
Dynamic workflows would be difficult to achieve. Consider the coinflip-recursive.yaml example, a dynamic workflow which recursively runs a template until it reaches a desired outcome. With DAGs, targets are typically pre-defined at submission; dynamically injecting new targets on the fly and setting up relationships between them for something like a recursive workflow would be difficult. I also do not believe we would want to allow circular dependencies between targets.
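Rejecting circular dependencies at submission time amounts to standard cycle detection on the dependency graph. A minimal Python sketch (my own illustration of the validation idea, not Argo's code) using a depth-first search with coloring:

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of targets, or None.
    `deps` maps each target to the list of targets it depends on."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited / in progress / done
    color = {t: WHITE for t in deps}
    stack = []

    def visit(t):
        color[t] = GRAY
        stack.append(t)
        for parent in deps.get(t, []):
            if color.get(parent, WHITE) == GRAY:
                # Back edge: the cycle is the stack from `parent` onward
                return stack[stack.index(parent):] + [parent]
            if color.get(parent, WHITE) == WHITE:
                cyc = visit(parent)
                if cyc:
                    return cyc
        stack.pop()
        color[t] = BLACK
        return None

    for t in deps:
        if color[t] == WHITE:
            cyc = visit(t)
            if cyc:
                return cyc
    return None

print(find_cycle({"a": ["b"], "b": ["a"]}))       # ['a', 'b', 'a']
print(find_cycle({"one": [], "two": ["one"]}))    # None
```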
CLI rendering. Our steps-based workflow is easily represented as a tree with a single root. Each node has a predictable pattern (every template is either a leaf, or a node which fans out and fans back in). Because of this, we are able to render the tree quite easily in the CLI. But since DAGs are completely free-form in their relationships between nodes, I do not think it will be feasible to render a free-form DAG from the CLI.
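The reason the steps tree renders easily is that every node has exactly one parent, so a simple recursive indent works. A minimal Python sketch (the `group-N` names are my own invention for the step groups, not Argo's output format):

```python
def render(node, children, indent=0):
    """Render a single-rooted tree, one line per node,
    with children indented under their parent."""
    lines = [" " * indent + node]
    for child in children.get(node, []):
        lines += render(child, children, indent + 2)
    return lines

# Steps execute as sequential groups under one root, so the
# result is always a tree with a single parent per node.
children = {
    "diamond": ["group-0", "group-1", "group-2"],
    "group-0": ["one"],
    "group-1": ["two", "three"],
    "group-2": ["four"],
}
print("\n".join(render("diamond", children)))
```

A free-form DAG breaks this approach: a node with two parents (like `four` with `two` and `three` as true dependencies) has no single place in the tree.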
In the end, my belief is that users may still end up preferring a steps-based approximation because of its simpler mental model.
Here is a simple, concrete example of a DAG which does not have a good approximation using steps-based execution (dependency edges point downward):
```
  A   B
 / \ /
C   D
```
The closest approximation of this using steps would be something like:
```
   *
  / \
 A   B
  \ /
   *
  / \
 C   D
  \ /
   *
```
As you can see, C's execution gets delayed since it waits on B unnecessarily.
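The delay can be made concrete by computing earliest finish times under each model. A small Python sketch with hypothetical durations of my choosing (B takes 5 time units, everything else 1):

```python
def earliest_finish(deps, dur):
    """Earliest finish time per node, where each node starts as
    soon as all of its dependencies finish.
    `deps` maps node -> dependency list, `dur` maps node -> duration."""
    memo = {}
    def finish(t):
        if t not in memo:
            memo[t] = dur[t] + max((finish(p) for p in deps[t]), default=0)
        return memo[t]
    return {t: finish(t) for t in deps}

dur = {"A": 1, "B": 5, "C": 1, "D": 1}
# True DAG: C depends only on A
dag = {"A": [], "B": [], "C": ["A"], "D": ["A", "B"]}
# Steps approximation: the barrier makes C wait on B as well
steps = {"A": [], "B": [], "C": ["A", "B"], "D": ["A", "B"]}

print(earliest_finish(dag, dur)["C"])    # 2: C starts right after A
print(earliest_finish(steps, dur)["C"])  # 6: C also waits on slow B
```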
After discussion, the decision is to make `dag` a new template type, which defines a list of tasks within that template. Example below:
```yaml
# The following workflow executes a multi-root workflow
#
#   A   B
#  / \ /
# C   D
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: diamond-dag-
spec:
  entrypoint: multiroot
  templates:
  - name: echo
    inputs:
      parameters:
      - name: message
    container:
      image: alpine:3.7
      command: [echo, "{{inputs.parameters.message}}"]
  - name: multiroot
    dag:
      tasks:
      - name: A
        template: echo
        arguments:
          parameters: [{name: message, value: A}]
      - name: B
        dependencies:
        template: echo
        arguments:
          parameters: [{name: message, value: B}]
      - name: C
        dependencies: [A]
        template: echo
        arguments:
          parameters: [{name: message, value: C}]
      - name: D
        dependencies: [A, B]
        template: echo
        arguments:
          parameters: [{name: message, value: D}]
```
What do you think about the actor model?
@bhack I'm unfamiliar with that concept. Is there a framework that uses this pattern that you could point to so I can understand further?
Take a look at this specific Go presentation
Another quite popular Go project is https://github.com/AsynkronIT/protoactor-go
It is also used for scientific workflows. There is also a related video
Thanks for the links -- I watched the video. Based on what I've learned, actor models seem to be more of a programming paradigm requiring a framework/SDK to handle the plumbing and messaging between nodes and their actors. This issue is more about how to support a traditional dependency-based DAG definition within a container-based workflow. It's definitely an interesting model which we may draw concepts from in the future, but it seems to be orthogonal to the issue at hand, which is defining a DAG in k8s yaml.
Yes, it is a little bit orthogonal to this issue because the message-passing component of the actor model doesn't exist off the shelf in Kubernetes core. But it could be an interesting workflow execution paradigm because it could include explicit data-exchange definitions in the workflow execution model.
@jlewi FYI
This is great!
I think this will make it easier to replace Airflow with Argo since Airflow offers the ability to do arbitrary DAGS.
I think in some ways specifying the dependencies is much easier than trying to manually flatten the graph into the current representation.
My general expectation though is that for really complicated graphs you'd want to use higher level tooling to generate them and not write them by hand.
Yes, we also think higher level tooling is needed for complex workflows.
We're also experimenting with invoking Airflow workflows from Argo workflows. This would let users mix and match existing Airflow operators/DAGs with container based workflows and leverage the huge catalog of existing container images.
DAG support is now in master and will be available as part of the upcoming v2.1 release