Tekton Pipelines should persist the logs of old PipelineRuns in the configured bucket.
The logs for old pipeline runs are not available. When I try to view the logs, fetching fails with the following message:
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"global-pr-checks-7wxpv-set-final-status-4nr8m-pod-jzv8z\" not found","reason":"NotFound","details":{"name":"global-pr-checks-7wxpv-set-final-status-4nr8m-pod-jzv8z","kind":"pods"},"code":404}
Kubernetes version:
Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-13T16:12:48Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"15+", GitVersion:"v1.15.11-eks-065dce", GitCommit:"065dcecfcd2a91bd68a17ee0b5e895088430bd05", GitTreeState:"clean", BuildDate:"2020-07-16T01:44:47Z", GoVersion:"go1.12.17", Compiler:"gc", Platform:"linux/amd64"}
Tekton Pipeline version:
Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}':
v0.16.3
I believe this is by-design and not a bug, but I do wish that it _wasn't_.
Nodes are replaced all the time and Pods will come and go freely over the lifetime of a cluster... it'd be nice if there were still a way to grab the logs for previous PipelineRuns. I'm sure there's a possible issue here with the 1 MB limit etcd places on each K8s API object, so perhaps that's why it's the way it is at the moment.
Question for devs: Is this something that's being investigated?
Logs are not stored in etcd AFAIK; this can be an issue if Tekton does not ship the logs to the configured S3/GCS bucket. Just want to confirm this behaviour. @bobcatfish can you share your inputs?
@daviddyball indeed, this is by design. Aggregating logs for Tekton workflows is similar to aggregating logs for any workload running in Kubernetes.
The main question is: should Tekton provide a component to aggregate logs somewhere, or should we just rely on existing ones in the Kubernetes ecosystem (Loki, …) and document them / give some advice on them? This is something that is being "investigated", but not as part of the core component of Tekton (aka tektoncd/pipeline).
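As an illustration of the "rely on existing ecosystem tools" option: with Loki, pulling a run's logs comes down to a single query_range call against its HTTP API. This is only a sketch; the tekton_dev_pipelineRun stream label is an assumption about how your scrape config relabels the pod label tekton.dev/pipelineRun, not something Tekton or Loki sets up by default.

```python
from urllib.parse import urlencode


def loki_query_url(base_url, pipeline_run, start_ns, end_ns, limit=5000):
    """Build a Loki /loki/api/v1/query_range URL for one PipelineRun.

    Assumes the pod label tekton.dev/pipelineRun was relabeled into a
    Loki stream label named `tekton_dev_pipelineRun` by your scrape
    config; adjust the LogQL selector to match your own setup.
    """
    params = {
        "query": '{tekton_dev_pipelineRun="%s"}' % pipeline_run,
        "start": start_ns,          # nanosecond epoch timestamps
        "end": end_ns,
        "limit": limit,
        "direction": "forward",     # oldest first, like `tkn pipelinerun logs`
    }
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"
```

An operator (or a future tkn fallback) could hit this URL with any HTTP client once the pods are gone, since Loki keeps the streams independently of pod lifetime.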
@vdemeester I totally understand the feature-creep that could be encountered by trying to solve this problem, but the main issue is that the core tkn CLI errors if you try to fetch logs for pods that were removed, which feels broken.
I get that logging is quite a nebulous thing and there are already so many different options for this (Loki, ELK, Datadog, etc.)... but in order for an operator to debug a pipeline whose containers are now missing from the system, they have to manually go and scrape their logging system and try to piece together and order the logs, which is honestly more effort than it's worth.
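For the record, the manual stitching described above boils down to something like the following once the lines have been scraped out of a logging backend. This is a sketch of the operator's chore, not a Tekton feature; it assumes the backend can hand back per-task entries with RFC3339 timestamps.

```python
def interleave_task_logs(task_logs):
    """Merge per-task log entries into one chronological stream.

    task_logs: {task_name: [(rfc3339_timestamp, line), ...]}
    Returns "[task] line" strings ordered by timestamp, roughly
    approximating what `tkn pipelinerun logs` prints.
    """
    merged = []
    for task, entries in task_logs.items():
        for ts, line in entries:
            merged.append((ts, task, line))
    # RFC3339 timestamps with the same UTC offset sort correctly as strings.
    merged.sort(key=lambda entry: entry[0])
    return [f"[{task}] {line}" for _, task, line in merged]
```

Even this simple version glosses over real problems (sub-second ties, clock skew between nodes, retried TaskRuns), which is exactly why doing it by hand for every debug session gets old fast.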
@hprateek43 you mentioned that there is an option to ship logs to object storage... where can I find that option? Are the logs shipped in a nicely readable format (e.g. like what you get from tkn pipelinerun logs)?
I completely agree :wink: There is definitely room for improvement here, and, long-term, tools like tkn should be able to get logs from runs that no longer have related pods. Not sure yet how we would do this, but this is something that, I think, we need to support, yes :+1: