Opentelemetry-specification: TraceID & SpanID are specified as "byte arrays" but retrieval is unspecified

Created on 24 Jul 2020 · 9 Comments · Source: open-telemetry/opentelemetry-specification

There is an inconsistency in how opentelemetry-js, opentelemetry-ruby, opentelemetry-go, and the other implementations handle the TraceID and SpanID fields on Span.

Most of OTel's language implementations (e.g. Ruby, Go, Python) return a byte array (or an integral representation), while others (like JS) return a hex-encoded string representation of the byte array. My understanding is that the hex string representation should be used for serialization/deserialization but not for the internal representation returned by Span.TraceID().
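To make the difference concrete, here is a minimal Python sketch of the three representations mentioned above (byte array, integer, hex string) and how they convert into one another. The 16-byte trace ID length follows the W3C Trace Context format; the helper names are hypothetical and are not taken from any OTel SDK.

```python
# Hypothetical helpers for illustration; not part of any OpenTelemetry SDK.
TRACE_ID_LEN = 16  # bytes, as in W3C Trace Context


def trace_id_to_hex(trace_id: bytes) -> str:
    """Serialize a raw trace ID as 32 lowercase hex characters."""
    if len(trace_id) != TRACE_ID_LEN:
        raise ValueError("trace ID must be 16 bytes")
    return trace_id.hex()


def trace_id_from_hex(text: str) -> bytes:
    """Parse the hex form back into the raw byte array."""
    raw = bytes.fromhex(text)
    if len(raw) != TRACE_ID_LEN:
        raise ValueError("trace ID must be 32 hex characters")
    return raw


def trace_id_to_int(trace_id: bytes) -> int:
    """Integral representation: the bytes read as a big-endian integer."""
    return int.from_bytes(trace_id, "big")


raw = bytes(range(16))
as_hex = trace_id_to_hex(raw)
as_int = trace_id_to_int(raw)
```

All three values carry the same information, so the question in this issue is only which one the API hands back by default.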

api p1 required-for-ga trace

All 9 comments

A third alternative (currently used in the rust implementation) is a new opaque type that doesn't couple the ids to their underlying data representation.
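As a rough illustration of the opaque-type idea, the Python sketch below wraps the ID in a class that supports comparison, hashing, and a validity check without revealing how the value is stored. The class and method names are assumptions made for this example; it is not a port of the Rust implementation.

```python
class TraceId:
    """Opaque trace ID: callers cannot depend on the storage format."""

    __slots__ = ("_value",)

    def __init__(self, raw: bytes) -> None:
        if len(raw) != 16:
            raise ValueError("trace ID must be 16 bytes")
        # Private storage choice (an int here); it could change freely
        # without affecting callers.
        self._value = int.from_bytes(raw, "big")

    @classmethod
    def from_hex(cls, text: str) -> "TraceId":
        return cls(bytes.fromhex(text))

    def is_valid(self) -> bool:
        # An all-zero trace ID is invalid in W3C Trace Context.
        return self._value != 0

    def __eq__(self, other: object) -> bool:
        return isinstance(other, TraceId) and self._value == other._value

    def __hash__(self) -> int:
        return hash(self._value)
```

Because nothing outside the class touches `_value`, an implementation could later switch to bytes or a string without an API break.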

This was brought up earlier by @flarna for JS as a performance optimization. After some benchmarks showed only a very minor improvement, the change was dropped due to lack of support and the effort required. https://github.com/open-telemetry/opentelemetry-js/issues/698

One complicating factor in JS is that the Buffer class typically used for "byte arrays" in JS is not available in browsers.

In Java we're exploring switching from opaque types to strings, since it's very common for the strings to be needed anyway, which keeps things a little simpler.

https://github.com/open-telemetry/opentelemetry-java/pull/1374

From the spec SIG meeting: triaged this as P1, assigning initially to @lizthegrey since it sounds like she's working on this in the Go SIG.

I don't think we should specify this. In different languages, different things may be more efficient.

I think if the goal is purely efficiency, leaving this unspecified makes sense. I'd consider whether there is a UX advantage to specifying retrieval as strings. In practice, users see IDs as strings in their trace console, logs, etc., so having something like getTraceId() return a String seems very intuitive.

I think we can recommend having a convenience getter with a string representation, regardless of how the IDs are stored internally. But languages should document if that one is doing an "expensive" conversion or not, and offer cheaper alternatives if applicable.
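A minimal sketch of that recommendation in Python, assuming the IDs are stored as raw bytes: the cheap accessor returns the stored value, while the string getter documents that it converts on first use and caches the result. `SpanContext` and its attribute names are hypothetical here, not an existing OTel API.

```python
from functools import cached_property


class SpanContext:
    """Hypothetical span context that stores IDs as raw bytes."""

    def __init__(self, trace_id: bytes, span_id: bytes) -> None:
        self._trace_id = trace_id
        self._span_id = span_id

    @property
    def trace_id_bytes(self) -> bytes:
        """Cheap: returns the stored value without any conversion."""
        return self._trace_id

    @cached_property
    def trace_id_hex(self) -> str:
        """Convenience getter. The first access converts bytes to hex
        (allocating a new string); later accesses reuse the cached value."""
        return self._trace_id.hex()


ctx = SpanContext(
    bytes.fromhex("0af7651916cd43dd8448eb211c80319c"),
    bytes.fromhex("b7ad6b7169203331"),
)
```

An implementation that stores strings internally would flip the docstrings: the hex getter becomes the cheap one and the bytes getter the conversion.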

> I think we can recommend having a convenience getter with a string representation, regardless of how the IDs are stored internally

+1. This is how we did it back in OpenTracing and seemed to work great.

We need to support both ways to retrieve the IDs:

  • base16 (I prefer it instead of hex) for w3c propagation and log correlation
  • bytes for OTLP and other binary propagation mechanisms

The API should not expose details about how they are internally stored.
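The two retrieval paths listed above could look roughly like the Python sketch below. Here the IDs happen to be stored as hex strings, but callers only see `*_base16()` and `*_bytes()` accessors, so the storage could change without affecting them. All names are hypothetical; the `traceparent` layout (`version-traceid-spanid-flags`) is the one defined by W3C Trace Context.

```python
class StringBackedContext:
    """Hypothetical context; the hex-string storage is a private detail."""

    def __init__(self, trace_id_hex: str, span_id_hex: str, sampled: bool) -> None:
        self._trace_id = trace_id_hex.lower()
        self._span_id = span_id_hex.lower()
        self._sampled = sampled

    def trace_id_base16(self) -> str:
        return self._trace_id

    def span_id_base16(self) -> str:
        return self._span_id

    def trace_id_bytes(self) -> bytes:
        return bytes.fromhex(self._trace_id)

    def span_id_bytes(self) -> bytes:
        return bytes.fromhex(self._span_id)

    def is_sampled(self) -> bool:
        return self._sampled


def to_traceparent(ctx: StringBackedContext) -> str:
    """Text path: build a W3C traceparent header value from base16 IDs."""
    flags = "01" if ctx.is_sampled() else "00"
    return f"00-{ctx.trace_id_base16()}-{ctx.span_id_base16()}-{flags}"


def to_binary_ids(ctx: StringBackedContext) -> bytes:
    """Binary path: concatenate raw trace and span ID bytes (24 bytes)."""
    return ctx.trace_id_bytes() + ctx.span_id_bytes()


ctx = StringBackedContext(
    "0AF7651916CD43DD8448EB211C80319C", "b7ad6b7169203331", sampled=True
)
header = to_traceparent(ctx)
payload = to_binary_ids(ctx)
```

`to_binary_ids` only stands in for a binary exporter; a real OTLP exporter would place the two byte strings into separate protobuf fields rather than concatenating them.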
