Perhaps it would be useful if argo logs workflow-xxx concatenated all the logs for the sub-steps.
The question is how fancy we want to get with logging all pods in the workflow. A simple approach may be to loop over each pod and emit the logs of each, one by one.
A better approach would be to use a docker-compose style of logging, where log outputs are merged into one stream but each line is prefixed with its step name. I think this would be much more useful when comparing the timings of the workflow.
I think we could develop a relatively straightforward merge-sort-style algorithm for achieving the above:
1. kubectl logs <podname> stored into temporary files (which preserves the log timestamps)
2. Merge the files into a single stream ordered by timestamp, prefixing each line with its step name
Some optimizations could be made to only kubectl logs the pods in order of their real-world execution (so we don't work on pods at the end of the workflow prematurely).
Definitely like the idea of labeling with step name and then merging based on time.
👍 for this proposal
I'm wondering if this is another problem that would be solved by having K8s provide good support for tagging and fetching logs using pod labels. (This is a big issue for Kubeflow). We have the same problem for TFJobs where we have multiple pods and people want to see the logs for specific pods or all pods.
My general sense is that this should be solved by having all logs tagged with the pod labels. Then CRDs just need to attach appropriate pod labels. So for Argo it could be the workflow name, step name, etc.; for TFJob it would be job name, replica index, etc.
Then a user could just fetch the logs by querying by pod label either via CLI or by whatever mechanism the backend supports (e.g. stackdriver UI).
From what I gather from the logging folks, this will be done by using side cars in the pod to attach pod labels to logs before shipping them out.
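For reference, a label-based query along these lines is already possible from the CLI today. This is an illustrative command, assuming the pods carry a workflow label (Argo sets workflows.argoproj.io/workflow on the pods it creates, and my-wf is a placeholder workflow name); --prefix and --timestamps require a reasonably recent kubectl:

```shell
# Fetch logs from every pod belonging to one workflow, tagged with the
# pod name and timestamped so the streams can be correlated by time.
kubectl logs -l workflows.argoproj.io/workflow=my-wf --prefix --timestamps
```

The missing piece this comment describes is pushing those same labels into the logging backend itself, so the query works in e.g. the Stackdriver UI and not just against live pods.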
This feature has been implemented. Workflow logs can be accessed using the -w flag: argo logs -w <workflowName>