Che: Opentracing support for k8s / OpenShift infrastructures in Che

Created on 5 Jul 2018 · 5 Comments · Source: eclipse/che

Introduction

OpenTracing is a set of vendor-neutral APIs and instrumentation for distributed tracing that allows switching the tracer implementation with O(1) complexity (e.g. switching between Jaeger and Zipkin is just a matter of updating the tracer config, with no need to update OpenTracing-related code).
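To illustrate why the switch is cheap, here is a minimal sketch in Python using only the standard library. It does not use the real OpenTracing API: `Tracer`, `JaegerLikeTracer`, `ZipkinLikeTracer`, `tracer_from_config` and `start_workspace` are all hypothetical names made up for this example. The point is only that the instrumented code depends on the neutral interface, so the backend is selected in one place.

```python
import time
from abc import ABC, abstractmethod


class Tracer(ABC):
    """Hypothetical vendor-neutral interface (a stand-in for the OpenTracing API)."""

    @abstractmethod
    def report(self, operation: str, duration_s: float) -> str:
        ...


class JaegerLikeTracer(Tracer):
    """Hypothetical backend; a real one would ship the span to Jaeger."""

    def report(self, operation: str, duration_s: float) -> str:
        return f"jaeger:{operation}"


class ZipkinLikeTracer(Tracer):
    """Hypothetical backend; a real one would ship the span to Zipkin."""

    def report(self, operation: str, duration_s: float) -> str:
        return f"zipkin:{operation}"


# The only place that knows about concrete backends.
TRACERS = {"jaeger": JaegerLikeTracer, "zipkin": ZipkinLikeTracer}


def tracer_from_config(name: str) -> Tracer:
    """Pick the backend from a config value; instrumented code is untouched."""
    return TRACERS[name]()


def start_workspace(tracer: Tracer) -> str:
    """Instrumented code: depends only on the neutral Tracer interface."""
    started = time.monotonic()
    # ... the actual work would happen here ...
    return tracer.report("start_workspace", time.monotonic() - started)
```

Changing the value passed to `tracer_from_config` from `"jaeger"` to `"zipkin"` is the whole migration in this sketch; `start_workspace` stays the same.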

Motivation

Adding OpenTracing support to Che would make it easy to collect metrics such as request duration and error rate. This would be particularly useful for the multi-user use case, and also handy for single-user setups, in order to identify potential bottlenecks in the system.

Demo

A short demo of collecting metrics between che-starter and che-server, using Zipkin as the tracer: https://youtu.be/4tWeH8JqQQk

Tasks

[x] Implementation plan https://github.com/eclipse/che/issues/11720
[x] creating CQs for opentracing & jaeger and resolving dependency conflicts
[ ] adding tracer agnostic opentracing support to wsmaster (jaeger would be used as the default tracer)
[x] adding yml templates with instructions for deploying jaeger to k8s / openshift, so that this jaeger instance would be used for collecting che metrics
[ ] adding opentracing support to wsagent in order to have a correlation between wsmaster -> wsagent interaction
[ ] adding opentracing support to wsnext
[ ] Tracing workspace related operation https://github.com/eclipse/che/issues/11922
[ ] Tracing operation made over JSON-RPC https://github.com/eclipse/che/issues/11923
[x] Tracing JDBC https://github.com/eclipse/che/pull/11905
[ ] Generic telemetry events infrastructure https://github.com/eclipse/che/issues/5483
[x] Trace workspace stopping in more detail https://github.com/eclipse/che/issues/12002
[ ] Trace ChePlugins and how they are applied to workspaces https://github.com/eclipse/che/issues/12003
[x] Jaeger deployment for Kubernetes https://github.com/eclipse/che/issues/12013
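
The wsmaster -> wsagent correlation mentioned in the tasks above relies on propagating the trace context with each request. A rough sketch of the idea, assuming plain HTTP headers as the carrier: the caller injects its trace and span ids, and the callee extracts them to create a child span in the same trace. The header names, the `Span` class and the `wsmaster_call` / `wsagent_handle` functions are hypothetical; a real tracer defines its own wire format and exposes this through its inject / extract operations.

```python
import uuid
from typing import Dict, Optional, Tuple

# Hypothetical header names, chosen for this sketch only.
TRACE_HEADER = "x-trace-id"
PARENT_HEADER = "x-parent-span-id"


class Span:
    """Hypothetical minimal span: just the ids needed for correlation."""

    def __init__(self, operation: str,
                 trace_id: Optional[str] = None,
                 parent_id: Optional[str] = None) -> None:
        self.operation = operation
        # A missing trace id means this span starts a new trace.
        self.trace_id = trace_id or uuid.uuid4().hex
        self.span_id = uuid.uuid4().hex[:16]
        self.parent_id = parent_id


def inject(span: Span, headers: Dict[str, str]) -> None:
    """Caller side: write the current context into the outgoing headers."""
    headers[TRACE_HEADER] = span.trace_id
    headers[PARENT_HEADER] = span.span_id


def extract(operation: str, headers: Dict[str, str]) -> Span:
    """Callee side: continue the caller's trace if the headers carry one."""
    return Span(operation, headers.get(TRACE_HEADER), headers.get(PARENT_HEADER))


def wsmaster_call() -> Tuple[Span, Dict[str, str]]:
    """Simulates wsmaster preparing a request to wsagent."""
    span = Span("wsmaster.request")
    headers: Dict[str, str] = {}
    inject(span, headers)
    return span, headers


def wsagent_handle(headers: Dict[str, str]) -> Span:
    """Simulates wsagent receiving the request."""
    return extract("wsagent.handle", headers)
```

With this in place, both spans share one `trace_id`, so the tracer backend can display the wsmaster and wsagent work as a single trace.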

Q/A

Is tracing the same as logging?
No. Tracing is mostly focused on metrics such as request duration and error rate in a microservice environment, which makes it possible to identify bottlenecks across the system.
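
As a rough illustration of what "request duration / error rate" means at the span level, the sketch below times an operation and flags it as failed if it raises. It is not the OpenTracing API; `span`, `finished_spans` and `error_rate` are hypothetical names, and a real tracer would report spans to a backend instead of keeping them in a list.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory sink; a real tracer would send spans to a backend.
finished_spans = []


@contextmanager
def span(operation):
    """Time the wrapped block and record whether it raised."""
    record = {"operation": operation, "error": False}
    started = time.monotonic()
    try:
        yield record
    except Exception:
        record["error"] = True
        raise  # tracing must not swallow the application's exception
    finally:
        record["duration_s"] = time.monotonic() - started
        finished_spans.append(record)


def error_rate(spans):
    """Fraction of spans that ended with an error (0.0 for no spans)."""
    if not spans:
        return 0.0
    return sum(1 for s in spans if s["error"]) / len(spans)
```

Wrapping a request handler in `with span("start_workspace"):` is enough to get a duration and an error flag for it; a log line, by contrast, only says that something happened.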

Should tracing (instrumentation) always be separate from logging?
It depends on the case, but in general it is common to separate tracing from logging. Correlation between logs and traces is achieved by passing the trace_id to every log entry. On the other hand, frameworks like Spring Cloud make it possible to record log entries within the relevant tracing spans. In a nutshell, it depends on the application. If performance is critical, adding all logs to tracing spans would be a costly operation, but for a small service where performance is not an issue, it makes sense to store logs with spans. For Che, OpenTracing makes the most sense for the multi-user use case, so it would be better to fall back on the approach of separating logs and traces, and to add logs to tracing spans only if this data would be useful for investigating metrics such as request duration and error rate.

kind/epic

ibuziuk

All 5 comments

I agree with the separation between logs and traces.
They have separate purposes. Logging is a critical part of analysis and should carry the essential information needed to detect problems, while tracing should mainly detect latency and bottlenecks and should carry minimal data to avoid severe performance issues.
So if both are required in Che, it would be better to keep logging and tracing separate, even for small services.