Opentelemetry-specification: Add proper error reporting before OpenTelemetry goes GA

Created on 12 May 2020 · 8 Comments · Source: open-telemetry/opentelemetry-specification

Reporting errors is an important aspect of tracing, monitoring, and observability systems, and we should make sure OpenTelemetry has a proper way to do so before we go GA. Currently the API only defines the span status, which can mark a span as failed, but nothing beyond that.

The current (mostly neglected) proposals out there are:

There also used to be some discussion about this on Gitter:
https://gitter.im/open-telemetry/error-events-wg

This issue serves as a tracking issue for milestone v0.5 so we don't forget about error reporting :-)

error-reporting p2 required-for-ga trace

All 8 comments

I'd even argue that the span status does not mark a span as failed. E.g. HTTP 4xx status codes are often not really errors but reported with a non-OK span status. This is discussed in issue https://github.com/open-telemetry/opentelemetry-specification/issues/306 (IMHO very relevant to this discussion). EDIT: There are also some related notes in the SIG meeting notes from 10/29/2019 #21
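To illustrate the point about 4xx responses, here is a minimal sketch of how a span status derived purely from the HTTP status code can disagree with whether the caller actually considers the operation failed. The `SpanStatus` enum and both function names are hypothetical and are not part of the OpenTelemetry API; the idea that a caller declares certain codes as "expected" is an assumption made only for this example.

```python
from enum import Enum


class SpanStatus(Enum):
    """Hypothetical, simplified span status; not the OpenTelemetry API."""
    OK = "ok"
    NOT_OK = "not_ok"


def status_from_http(status_code: int) -> SpanStatus:
    """Naive mapping: every 4xx/5xx response yields a non-OK span status."""
    return SpanStatus.NOT_OK if status_code >= 400 else SpanStatus.OK


def is_failure(status_code: int, expected_codes: frozenset = frozenset()) -> bool:
    """Separate question: did the operation actually fail for this caller?

    Assumption for illustration: the caller lists the codes it treats as a
    normal outcome (e.g. 404 when probing whether a resource exists).
    """
    if status_code in expected_codes:
        return False
    return status_code >= 400


if __name__ == "__main__":
    # A 404 that the caller expects: non-OK status, yet not a failure.
    print(status_from_http(404), is_failure(404, frozenset({404})))
    # A 500 is both non-OK and a failure.
    print(status_from_http(500), is_failure(500, frozenset({404})))
```

In this sketch a 404 the caller was prepared for still produces a non-OK status, which is why the status alone may not be a reliable failure marker.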

Does it make sense to align logs and errors? In some systems (including Go) errors are reported as logs / via logging API - therefore whatever is decided in https://github.com/open-telemetry/oteps/pull/97 will heavily influence this.

> Does it make sense to align logs and errors? In some systems (including Go) errors are reported as logs / via logging API - therefore whatever is decided in open-telemetry/oteps#97 will heavily influence this.

@vmihailenco
It could make sense to align them, but we certainly need error reporting for GA, and logs are out of scope for that, if I'm not mistaken. @tigrannajaryan, am I right?

> Does it make sense to align logs and errors? In some systems (including Go) errors are reported as logs / via logging API - therefore whatever is decided in open-telemetry/oteps#97 will heavily influence this.

I believe we should align Events in Spans (where errors supposedly should/can be recorded) and Standalone Logs. I added this to capture the alignment effort: https://github.com/open-telemetry/opentelemetry-specification/issues/622

> It could make sense to align them, but we certainly need error reporting for GA, and logs are out of scope for that, if I'm not mistaken. @tigrannajaryan, am I right?

@arminru I think it will still be beneficial to make this alignment even if logs are not in scope and will not be GA-ed at the same time as traces.

I hope adding this to required-for-ga is uncontroversial.

@andrewhsu @open-telemetry/technical-committee I believe this should now be closed by open-telemetry/oteps#136

Closing since this is about to be resolved by the implementation of OTEP 136, which is already tracked in #965.
