Argo: Execute Argo Workflow as CRON

Created on 22 Feb 2019 · 23 Comments · Source: argoproj/argo

Is this a BUG REPORT or FEATURE REQUEST?:
Feature Request

What happened:
Execute Workflows as Cron

What you expected to happen:
I would like to use Argo Workflow to create Database Backups in Kubernetes.
The process first generates a backup of a Postgres DB, then encrypts the backup, and in the last step uploads it. (This is maybe something where I don't need Argo, but...)
We also maintain Oracle Databases and there a backup is a bit more tricky:

  1. I ask Oracle to create a backup and wait until it is finished
  2. I download the Backup from Oracle File Storage
  3. I encrypt the Backup
  4. I upload the backup
  5. I delete the Backup on Oracle File Storage

To make this continuous it would be nice to have a cron in a Workflow. Like the Kubernetes CronJob with a CRON Pattern.
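
The five Oracle steps above map naturally onto sequential Argo steps. The following is only a minimal sketch that builds such a Workflow manifest as a Python dict. The step names, the image names and the `oracle-backup-` prefix are hypothetical placeholders, not anything taken from this thread.

```python
import json

# Hypothetical step names and container images, used purely for illustration.
BACKUP_STEPS = [
    ("create-backup", "example/oracle-backup:latest"),
    ("download-backup", "example/oracle-download:latest"),
    ("encrypt-backup", "example/encrypt:latest"),
    ("upload-backup", "example/upload:latest"),
    ("delete-remote-backup", "example/oracle-cleanup:latest"),
]


def build_backup_workflow(steps):
    """Build a Workflow manifest (as a dict) that runs the steps one after another."""
    templates = [
        {
            "name": "backup-pipeline",
            # In Argo's `steps` syntax each inner list is one sequential stage.
            "steps": [[{"name": name, "template": name}] for name, _ in steps],
        }
    ]
    for name, image in steps:
        templates.append({"name": name, "container": {"image": image}})
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "oracle-backup-"},
        "spec": {"entrypoint": "backup-pipeline", "templates": templates},
    }


if __name__ == "__main__":
    print(json.dumps(build_backup_workflow(BACKUP_STEPS), indent=2))
```

What is still missing, and what this issue asks for, is a way to attach a cron schedule to such a workflow.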

All 23 comments

Read first then write, sorry for this!

@DaspawnW or anyone else, can you post the link?

i think @DaspawnW might be referring to either:

  • instructions, described here, for using a kubernetes cronjob which runs a container where argo submit is executed to kick off the workflow

or

i don't like the second option because it seems to require inlining my workflow into the sensor definition as a string -- not sure why i can't just specify it as data since i'm just putting a yaml file inside a yaml file.

maybe there's some other way to have a workflow run regularly, but i haven't found it by reading the docs and googling for the last hour or so
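
For reference, the first option mentioned above (a Kubernetes CronJob whose container runs argo submit) could look roughly like the manifest built below. This is a sketch under assumptions: the image name, the workflow file path and the resource name are hypothetical, and the pod would also need a service account that is allowed to create Workflows, which is not shown.

```python
import json


def cronjob_submitting_workflow(name, schedule, workflow_path, image):
    """Build a CronJob manifest (as a dict) whose pod runs `argo submit`."""
    return {
        # batch/v1 on current clusters; clusters from 2019 used batch/v1beta1.
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,
            # Skip a run if the previous submit job is still active.
            "concurrencyPolicy": "Forbid",
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [
                                {
                                    "name": "submit",
                                    "image": image,  # must contain the argo CLI
                                    "command": ["argo", "submit", workflow_path],
                                }
                            ],
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = cronjob_submitting_workflow(
        name="submit-backup-workflow",          # hypothetical name
        schedule="*/5 * * * *",
        workflow_path="/workflows/backup.yaml",  # hypothetical path in the image
        image="example/argo-cli:latest",         # hypothetical image
    )
    print(json.dumps(manifest, indent=2))
```

Note that concurrencyPolicy here only governs the short-lived submit job, not the workflow it starts, which is one reason native scheduling is attractive.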

Following this, we tried argo-events but found it quite complicated, so we ended up creating a new project that simply fetches a Git repository and triggers the Argo Workflow using a K8s CronJob.
The benefits are control over the concurrency of workflows and that it is quite simple to configure!
We also set it up so it clears old workflows to avoid filling the cluster with already finished containers.

https://github.com/bitphy/argo-cron

How can we help in this community then?

@jessesuen @sarabala1979 Should we support cron workflows natively?

apiVersion: argoproj.io/v1alpha1
kind: Workflow                  # new type of k8s spec
metadata:
  generateName: hello-world-    # name of the workflow spec
spec:
  schedule: "*/5 * * * *"
  entrypoint: whalesay          # invoke the whalesay template
  templates:
  - name: whalesay              # name of the template
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["hello world"]
      resources:                # limit the resources
        limits:
          memory: 32Mi
          cpu: 100m
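
For readers less familiar with cron syntax: the schedule "*/5 * * * *" in the manifest above fires every five minutes. The sketch below checks whether a given time matches a five-field expression. It is a deliberately simplified matcher, not how Kubernetes or Argo parse schedules: it supports only numbers, `*`, lists, ranges and `/step`, it does not understand names such as MON or JAN, and it requires all five fields to match, whereas traditional cron treats day-of-month and day-of-week as alternatives when both are restricted.

```python
from datetime import datetime


def _field_matches(field, value, minimum):
    """Return True if one cron field (e.g. "*/5", "1,15", "9-17") matches value."""
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = minimum, value  # open-ended range that always reaches value
        elif "-" in part:
            low, high = part.split("-")
            start, end = int(low), int(high)
        else:
            start = end = int(part)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expr, when):
    """Return True if `when` matches the five-field cron expression `expr`."""
    minute, hour, day_of_month, month, day_of_week = expr.split()
    cron_weekday = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        _field_matches(minute, when.minute, 0)
        and _field_matches(hour, when.hour, 0)
        and _field_matches(day_of_month, when.day, 1)
        and _field_matches(month, when.month, 1)
        and _field_matches(day_of_week, cron_weekday, 0)
    )


if __name__ == "__main__":
    print(cron_matches("*/5 * * * *", datetime(2019, 2, 22, 10, 15)))
    print(cron_matches("*/5 * * * *", datetime(2019, 2, 22, 10, 17)))
```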

I think I understand the reason for decoupling the execution of a workflow using argo-events.
But I think scheduling is a very basic concept in terms of workflows. It is not intuitive for users when you tell them that they need to define the scheduling somewhere else. Also, if you want to be able to schedule a workflow and also run it on demand, you will need to write it twice (inline or as a config map in the sensor, and as a workflow file), which is not so good.

I think it would be very useful to define the schedule as part of the workflow and to allow creating a workflow from a scheduled workflow, similar to the way it is done for CronJob:
kubectl create job --from=cronjob/<cronjob-name> <job-name>
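
Conceptually, creating a one-off workflow from a scheduled one just means copying the spec and dropping the schedule. A minimal sketch of that idea follows. It assumes the flat manifest shape proposed in this thread (a `schedule` field directly under `spec`), which is a proposal and not a released Argo API; the function name is hypothetical.

```python
import copy


def workflow_from_scheduled(scheduled, name):
    """Return a one-off Workflow manifest derived from a scheduled one.

    Assumes the shape proposed in this thread: `schedule` sits directly
    under `spec`, next to `entrypoint` and `templates`.
    """
    workflow = copy.deepcopy(scheduled)  # never mutate the scheduled manifest
    workflow["kind"] = "Workflow"
    workflow["spec"].pop("schedule", None)  # a one-off run has no schedule
    metadata = dict(workflow.get("metadata", {}))
    metadata.pop("generateName", None)
    metadata["name"] = name  # explicit name, like <job-name> for CronJob
    workflow["metadata"] = metadata
    return workflow


if __name__ == "__main__":
    scheduled = {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "hello-world-"},
        "spec": {"schedule": "*/5 * * * *", "entrypoint": "whalesay", "templates": []},
    }
    print(workflow_from_scheduled(scheduled, "hello-world-manual"))
```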

@Shimi Yep, I agree.

I will try to implement this feature in the future.

apiVersion: argoproj.io/v1alpha1
kind: Workflow                  # new type of k8s spec
metadata:
  generateName: hello-world-    # name of the workflow spec
spec:
  schedule: "*/5 * * * *"

@xianlubird, I think a separate CronWorkflow CRD makes more sense than adding this functionality directly to Argo Workflows. It is a more proper separation of concerns (following the same pattern as CronJob vs. Job). I also feel this should be a separate controller and project.

Would you be interested in starting this off as an https://github.com/argoproj-labs project?

Yep, just like CronJob vs. Job, I think providing a separate CronWorkflow CRD makes sense. But for CronJob and Job, the controller is the same kube-controller-manager.

Can we just provide a new CronWorkflow CRD in the argo workflow project and use the same controller? In that case, we can reuse the argo command, e.g. argo cron list or argo cron submit; a subcommand would be much more convenient, and we can reuse much more code. Users can also deploy argo just once and run both cron and normal workflows in the same environment.

apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow                
metadata:
  generateName: cron-hello-world-    
spec:
  schedule: "*/5 * * * *"
  entrypoint: whalesay          # invoke the whalesay template
  templates:
  - name: whalesay              # name of the template
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["hello world"]
      resources:                # limit the resources
        limits:
          memory: 32Mi
          cpu: 100m

We can also run argo submit --from=cronWorkflow/<cronWorkflow> <name> to turn a CronWorkflow into a normal Workflow.

@jessesuen WDYT ?

@xianlubird I'm happy to help implement this feature. If you have already started the implementation, let me know; I'm ready to work on this.

@Sharathmk99 Will you implement this in argo-controller ?

@xianlubird yes, sure, with your guidance. I'll get started.

@xianlubird , @Sharathmk99 , when you implement this CronWorkflow, can you please include these features similar to k8s CronJob: successfulJobsHistoryLimit, failedJobsHistoryLimit and .spec.concurrencyPolicy. Thanks.
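
To make the requested behaviour concrete, here is a small sketch of how the Kubernetes CronJob semantics for these fields could be applied to workflow runs. In CronJob, concurrencyPolicy takes the values Allow, Forbid and Replace, and the two history limits cap how many finished successful and failed jobs are kept. The `Run` type and both function names are hypothetical; this is not code from any Argo controller.

```python
from dataclasses import dataclass


@dataclass
class Run:
    name: str
    finished_at: int  # any sortable timestamp; larger means more recent
    succeeded: bool


def decide(policy, active_runs):
    """Return (action, runs_to_stop) for a schedule tick.

    Allow:   always start, even if earlier runs are still active.
    Forbid:  skip this tick while any earlier run is still active.
    Replace: stop the active runs and start a new one.
    """
    if policy == "Allow":
        return ("start", [])
    if policy == "Forbid":
        return ("skip", []) if active_runs else ("start", [])
    if policy == "Replace":
        return ("start", list(active_runs))
    raise ValueError(f"unknown concurrencyPolicy: {policy}")


def runs_to_delete(finished, successful_limit, failed_limit):
    """Return names of finished runs that exceed the history limits."""
    doomed = []
    for succeeded, limit in ((True, successful_limit), (False, failed_limit)):
        group = sorted(
            (run for run in finished if run.succeeded == succeeded),
            key=lambda run: run.finished_at,
            reverse=True,  # newest first, so the slice below drops the oldest
        )
        doomed.extend(run.name for run in group[limit:])
    return doomed


if __name__ == "__main__":
    history = [Run("s1", 1, True), Run("s2", 2, True), Run("f1", 1, False)]
    print(decide("Forbid", ["still-running"]))
    print(runs_to_delete(history, successful_limit=1, failed_limit=1))
```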

This was closed immediately after opening but has a lot of good discussion after. Can it be re-opened so it better reflects the current state?

I'm going to create a general cron resource operator which allows you to schedule any resource, including Workflow. Maybe it can be helpful.

@dtaniwaki cool

@dtaniwaki Thanks. Waiting for this

@dtaniwaki I have something for this. Ping me next week.

@simster7 is working on CronWorkflow. He is almost through. He is adding a new CronWorkflow CRD that holds the cron properties and the workflow.

@sarabala1979 I guess great minds think alike. I have similar stuff. Let's get together next week

@llimon @dtaniwaki I've been working on this for the last couple of days internally. Feel free to ping me on the slack channel (@simon) if there's anything you guys want to discuss

@llimon @simster7 Thank you for implementing the feature. I have not made much progress because I was quite busy with other things. Please feel free to go ahead and discuss it between yourselves.

https://github.com/argoproj/argo/pull/1758 is now ready for community review and feedback
