See https://github.com/open-telemetry/opentelemetry-specification/pull/761#discussion_r466313064.
For Go errors, see https://blog.golang.org/go1.13-errors. Go errors are not exceptions: the defining characteristic of an exception is some kind of special throw/try/catch control flow that separates the "happy path" from the "error path", and Go errors are ordinary return values.
Currently we seem to implicitly encourage recording Go errors as exceptions: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/exceptions.md#stacktrace-representation says that Go should use runtime/debug.Stack (a function that returns the current stack and is not directly associated with error handling or error objects).
On the other hand, Go also has panics (https://blog.golang.org/defer-panic-and-recover), which seem to match exceptions far better.
Rust may require similar treatment (it also has panics and a Result type), as may some C++ applications (using Boost.Outcome, std::expected, etc.).
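To make the distinction above concrete, here is a minimal sketch in Go. The names lookup, safeIndex, and errNotFound are made up for illustration. It shows an error returned as a plain value next to a panic that unwinds the stack until a deferred recover stops it, with debug.Stack called at the recovery point.

```go
package main

import (
	"errors"
	"fmt"
	"runtime/debug"
)

// errNotFound is a hypothetical sentinel error used only in this sketch.
var errNotFound = errors.New("not found")

// lookup reports failure as an ordinary return value. No stack unwinding
// happens, and the caller is free to ignore the error.
func lookup(key string) (string, error) {
	if key == "" {
		return "", errNotFound
	}
	return "value-for-" + key, nil
}

// safeIndex turns a panic (here an out-of-range index) into an error.
// The panic unwinds the stack until the deferred function recovers it,
// which is the closest Go analogue to throw/catch.
func safeIndex(xs []int, i int) (v int, stack string, err error) {
	defer func() {
		if r := recover(); r != nil {
			// debug.Stack returns the stack of the calling goroutine;
			// it is not tied to any error object.
			stack = string(debug.Stack())
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	return xs[i], "", nil
}

func main() {
	_, err := lookup("")
	fmt.Println("error value:", err)

	_, stack, err := safeIndex([]int{1}, 5)
	fmt.Println("panic as error:", err)
	fmt.Println("stack captured:", stack != "")
}
```

Only the second case involves the kind of control flow the exception conventions were written for; the first is just a value the caller inspects.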
I can think of two solutions:
CC @open-telemetry/go-approvers
I think that it makes sense to treat panic/recover like throw/catch. Errors are definitely not exceptions, though sometimes they are used similarly, even if only for the fact that they can be easily (even inadvertently) ignored, which stops propagation through the call stack.
This is a blocker for Go, but is it a blocker for the spec? Do we believe that while working on this we may find that Go requires something that makes the current exception specification impossible to use?
I don't think so. We might just discover that Go needs additional attributes or a different convention. E.g. maybe Go will use an event attribute "unwinding=false" to distinguish panics from error returns. So I think we can remove the required-for-ga here.
@open-telemetry/go-maintainers ptal, fyi this one was moved to after-ga, and respond if this is not desirable.
Looking at the specification for handling exceptions I wonder if the canonical Go implementation would fit. Currently we provide a way to record errors with a specific method of a span. This seems in line with the purpose of the specification on exception handling. However, the canonical way to deal with panics in Go is to recover in a deferred function that is run when the current scope ends. This means that we have added panic handling in the Span.End method to add an appropriate event capturing the relevant information.
I think this is in line with the spirit of what the specification is trying to solve, but it does not fit the specific language of the specification. If this seems fine and we all agree that it would not block the Go implementation from saying it implements the spec I'm fine with waiting post-GA to update the language to accommodate this implementation.
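As a rough illustration of the approach described above, the sketch below recovers a panic inside a deferred End call and records it as an event. The span and event types are hypothetical stand-ins, not the OTel Go API. Re-panicking after recording is an assumption of this sketch rather than something stated in this thread. The event name and attribute keys follow the exception semantic conventions.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// event is a hypothetical stand-in for a span event (not the OTel Go API).
type event struct {
	Name       string
	Attributes map[string]string
}

// span is a hypothetical minimal span that only collects events.
type span struct {
	events []event
}

// End is meant to be called via defer. If the goroutine is panicking,
// it records an "exception" event and then re-panics so that the panic
// keeps propagating (the re-panic is an assumption of this sketch).
func (s *span) End() {
	if r := recover(); r != nil {
		s.events = append(s.events, event{
			Name: "exception",
			Attributes: map[string]string{
				"exception.type":       fmt.Sprintf("%T", r),
				"exception.message":    fmt.Sprint(r),
				"exception.stacktrace": string(debug.Stack()),
			},
		})
		panic(r)
	}
}

// runWithSpan is a hypothetical helper that runs fn inside a span and
// returns the span so its events can be inspected.
func runWithSpan(fn func()) (s *span) {
	s = &span{}
	// Outer recover, registered first so it runs last; it swallows the
	// re-panic so this example can keep running.
	defer func() { _ = recover() }()
	defer s.End()
	fn()
	return s
}

func main() {
	s := runWithSpan(func() { panic("boom") })
	for _, e := range s.events {
		fmt.Println(e.Name, e.Attributes["exception.type"], e.Attributes["exception.message"])
	}
}
```

Note that recover only has an effect because End itself is the deferred function; calling it from a non-deferred helper would not capture the panic.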
It looks like this may indeed block the OTel Go release based on the fact that backends are using the exception semantics to represent all errors within the system and Go does not send errors in this form.
I don't think languages are supposed to invent and use their own semantic conventions without submitting them as a PR to the spec. So if Go uses an error.* prefix but otherwise the same semantic conventions, I think we need to do something, and not (only) in backends. We could try to put the conventions Go uses now into the spec. But I wonder what e.g. Rust uses (do we have any other OTel languages that also use errors instead of exceptions?).
If exception conventions are a good fit except for the name, I would actually prefer adding a new attribute to the convention that describes the kind of "exception", i.e. whether it is a classical stack-unwinding exception or something else.
See also https://cloud-native.slack.com/archives/C01N7PP1THC/p1615194505073700?thread_ts=1614978903.067900&cid=C01N7PP1THC and https://github.com/open-telemetry/opentelemetry-specification/issues/764#issuecomment-689433475
I believe that Rust has decided to emit exception.* attributes, even though it uses errors. We have a PR in the Go repo that would change our RecordError method to also produce events with the exception.* attributes. I'm fine with landing that, despite the conceptual mismatch, precisely because consistency for downstream processors is important. If we could add a new attribute to communicate the category or form of the exception/error, that may be a good way to bridge some of that conceptual gap.