Argo: archive logs bug: failed to save outputs: timed out waiting for the condition

Created on 24 Feb 2020 · 6 Comments · Source: argoproj/argo

Checklist:

  • [x] I've included the version.
  • [x] I've included reproduction steps.
  • [x] I've included the workflow YAML.
  • [x] I've included the logs.

What happened:
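A simple workflow fails while saving its logs to S3. The node ends in Error with the message "failed to save outputs: timed out waiting for the condition".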

What you expected to happen:
The logs are saved to S3 successfully.

How to reproduce it (as minimally and precisely as possible):
CronWorkflow

metadata:
  name: lovely-rhino
  namespace: argo
spec:
  workflowSpec:
    templates:
      - name: whalesay
        inputs: {}
        outputs: {}
        metadata: {}
        container:
          name: main
          image: 'docker/whalesay:latest'
          command:
            - cowsay
          args:
            - hello world
          resources: {}
    entrypoint: whalesay
    arguments: {}
  schedule: 30 13 * * *
  concurrencyPolicy: Forbid

workflow-controller-configmap

artifactRepository:
  archiveLogs: true
  s3:
    endpoint: s3.amazonaws.com
    bucket: example-argo-logs
    keyPrefix: argo-logs
    region: eu-west-1

Anything else we need to know?:

Environment:

  • Argo version:
$ argo version
2.6.0-rc2
  • Kubernetes version:
$ kubectl version -o yaml
clientVersion:
  buildDate: "2019-06-19T16:40:16Z"
  compiler: gc
  gitCommit: e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529
  gitTreeState: clean
  gitVersion: v1.15.0
  goVersion: go1.12.5
  major: "1"
  minor: "15"
  platform: darwin/amd64
serverVersion:
  buildDate: "2019-12-22T23:14:11Z"
  compiler: gc
  gitCommit: c0eccca51d7500bb03b2f163dd8d534ffeb2f7a2
  gitTreeState: clean
  gitVersion: v1.14.9-eks-c0eccc
  goVersion: go1.12.12
  major: "1"
  minor: 14+
  platform: linux/amd64

Other debugging information (if applicable):

  • workflow-controller logs:
kubectl logs -n argo $(kubectl get pods -l app=workflow-controller -n argo -o name)
time="2020-02-24T13:48:02Z" level=info msg="Processing workflow" namespace=argo workflow=lovely-rhino-846d9
time="2020-02-24T13:48:03Z" level=info msg="Processing workflow" namespace=argo workflow=lovely-rhino-846d9
time="2020-02-24T13:48:03Z" level=info msg="Updating node &NodeStatus{ID:lovely-rhino-846d9,Name:lovely-rhino-846d9,DisplayName:lovely-rhino-846d9,Type:Pod,TemplateName:whalesay,TemplateRef:nil,Phase:Pending,BoundaryID:,Message:ContainerCreating,StartedAt:2020-02-24 13:48:00 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:,} status Pending -> Running"
time="2020-02-24T13:48:03Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=8373555 workflow=lovely-rhino-846d9
time="2020-02-24T13:48:03Z" level=info msg="Enforcing history limit for 'lovely-rhino'"
time="2020-02-24T13:48:04Z" level=info msg="Processing workflow" namespace=argo workflow=lovely-rhino-846d9
time="2020-02-24T13:48:17Z" level=info msg="Alloc=8406 TotalAlloc=97908 Sys=70080 NumGC=20 Goroutines=103"
time="2020-02-24T13:48:35Z" level=info msg="Processing workflow" namespace=argo workflow=lovely-rhino-846d9
time="2020-02-24T13:48:35Z" level=info msg="Processing workflow" namespace=argo workflow=lovely-rhino-846d9
time="2020-02-24T13:48:35Z" level=info msg="Updating node &NodeStatus{ID:lovely-rhino-846d9,Name:lovely-rhino-846d9,DisplayName:lovely-rhino-846d9,Type:Pod,TemplateName:whalesay,TemplateRef:nil,Phase:Running,BoundaryID:,Message:,StartedAt:2020-02-24 13:48:00 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:,} status Running -> Error"
time="2020-02-24T13:48:35Z" level=info msg="Updating node &NodeStatus{ID:lovely-rhino-846d9,Name:lovely-rhino-846d9,DisplayName:lovely-rhino-846d9,Type:Pod,TemplateName:whalesay,TemplateRef:nil,Phase:Error,BoundaryID:,Message:,StartedAt:2020-02-24 13:48:00 +0000 UTC,FinishedAt:0001-01-01 00:00:00 +0000 UTC,PodIP:,Daemoned:nil,Inputs:nil,Outputs:nil,Children:[],OutboundNodes:[],StoredTemplateID:,WorkflowTemplateName:,TemplateScope:,} message: failed to save outputs: timed out waiting for the condition"
time="2020-02-24T13:48:35Z" level=info msg="Updated phase Running -> Error" namespace=argo workflow=lovely-rhino-846d9
time="2020-02-24T13:48:35Z" level=info msg="Updated message  -> failed to save outputs: timed out waiting for the condition" namespace=argo workflow=lovely-rhino-846d9
time="2020-02-24T13:48:35Z" level=info msg="Marking workflow completed" namespace=argo workflow=lovely-rhino-846d9
time="2020-02-24T13:48:35Z" level=info msg="Checking daemoned children of " namespace=argo workflow=lovely-rhino-846d9
time="2020-02-24T13:48:35Z" level=info msg="Workflow update successful" namespace=argo phase=Error resourceVersion=8373652 workflow=lovely-rhino-846d9
time="2020-02-24T13:48:35Z" level=warning msg="Workflow 'lovely-rhino-846d9' from CronWorkflow 'lovely-rhino' completed"


Message from the maintainers:

If you are impacted by this bug please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

bug

All 6 comments

Can you please examine the wait container's logs? This might shed more light on the problem.

Why has this been closed? The issue is not within the container itself, I believe, but a time-out in the Kubernetes API causing the step to be flagged as an error, which is incorrect.

@rmgpinto How did you resolve your issue?

Yes, I forgot to leave the solution. The pod needs the IAM permission to write to S3.
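
For anyone hitting the same error, here is a minimal sketch of what such a permission could look like. It only writes out an IAM policy document for the bucket and key prefix from the configmap above. The file name `argo-s3-policy.json` is hypothetical, and the action list is an assumption: `s3:PutObject` covers writing the logs, and `s3:GetObject` is included on the guess that the archived logs will be read back later. Your setup may need more or less than this.

```shell
# Bucket and key prefix are taken from the workflow-controller-configmap above.
BUCKET=example-argo-logs
KEY_PREFIX=argo-logs

# Write a policy document (hypothetical file name) that allows writing
# and reading objects under the configured key prefix only.
cat > argo-s3-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject"],
      "Resource": "arn:aws:s3:::${BUCKET}/${KEY_PREFIX}/*"
    }
  ]
}
EOF

# Print the result for review before attaching it anywhere.
cat argo-s3-policy.json
```

The document could then be attached to whichever IAM role the workflow pods actually run under, for example with `aws iam put-role-policy --role-name <role> --policy-name <name> --policy-document file://argo-s3-policy.json`. The role and policy names are placeholders. Which role applies (node instance role or a role bound to the service account) depends on how the cluster is set up.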

In my case, minio was filling up its PVC and running out of space.

It worked for me after configuring the GCS artifact repository in workflow-controller-configmap.
