What are you trying to achieve?
After the merge of this PR: https://github.com/open-telemetry/opentelemetry-specification/pull/966, we can see that the StatusCanonicalCode got changed to Error, Unset, and Ok.
But, if we see the rpc status: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/rpc.md#status, they kept the same.
Additional context.
Should the status be the same? How should we map the rpc status to the new StatusCanonicalCode? For example: Unkown is an error or an unset status?
cc: @cijothomas @reyang
@tedsuo
Good catch, this needs to be updated. Thanks @eddynaka!
In order to resolve this, we'll need to do two things:
rpc.grpc.status_code) to store the GRPC Status Code, which would otherwise be lost now that we have a different and considerably reduced set of Status Codes in OTelHi @arminru , thanks for mapping!
Do you have a timeline to be official in the spec?
@arminru I think the mapping will have to match the mapping we have defined in the proto, see https://github.com/open-telemetry/opentelemetry-proto/pull/224. And there AFAIK the current proposal is OK -> UNSET, everything else -> ERROR. EDIT: Oh, we have the reverse mapping there and it is not unambiguously reversible because both unset and ok map to OK. But we should be consistent, and should never set Ok from instrumentation.
@carlosalberto from the issue triage mtg today, assigning to you since you previously worked in this area
Just to get a clear response. Would that be right the mapping below?
For HttpStatusCode:
>= 100 && <= 399 -> UNSETFor RPC:
OK -> UNSETFor HTTP, the mapping is already defined (indeed like you describe it) here: https://github.com/open-telemetry/opentelemetry-specification/blob/4b19e006ede63471103925edc6a15ab1d7579944/specification/trace/semantic_conventions/http.md#status
For RPC, I think that mapping makes sense, but we need to specify it in the semantic conventions. A PR would be welcome for that 馃槂
@Oberon00 oh yes of course. I forgot that we don't have the status source from the status OTEP in the API (yet?), which would make an instrumentation-determined Ok distinguishable. In this case, your and @eddynaka's proposal makes sense.
In order to resolve this, we'll need to do two things:
- Add a mapping from GRPC Status Codes to our new OTel Status Code (Unset, Ok, Error), e.g.
- OK --> Ok
- CANCELLED and UNKNOWN --> undefined (up to the user) or leaving Unset
- all the others --> Error
- Add a new attribute (e.g.,
rpc.grpc.status_code) to store the GRPC Status Code, which would otherwise be lost now that we have a different and considerably reduced set of Status Codes in OTel
Hi @arminru , for gRPC, we already have the grpc.status_code in our Activity (which is like Span). Can we maintain grpc.status_code instead of rpc.grpc.status_code? This would prevent us from removing one item in the list and re-adding the same item with different name.
@eddynaka
For the existing db.* and messaging.* attributes we use the respective db.system and messaging.system name for namespacing (e.g. db.mssql.instance_name, db.mongodb.collection and messaging.rabbitmq.routing_key) already.
In order to stay consistent here, the RPC semantic convention would need to use use rpc.grpc.* rather than introducing a new, top-level namespace for grpc.* _next to_ rpc.*.
I can see, however, that the repetition of _rpc_ looks a bit odd at first but it's simply the name of the RPC system/flavor being used and also the same with db.mongodb.*, for example.
See
https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/database.md#call-level-attributes-for-specific-technologies
and
https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/messaging.md#attributes-specific-to-certain-messaging-systems
from the spec issue triage mtg today, after discussing this issue sounds like this can be changed to allowed-for-ga since it is not a blocker for trace spec freeze
I don't understand how this cannot be a blocker for GA. The change in the status code spec stomped all over the status code emitted from RPC spans. Without defining a specific tag to carry that status code the receivers of the traces (collector + exporters) won't have any idea how to report the actual RPC status to backends. My team definitely relies on that breakdown of RPC failure codes.
I'm 100% in favor of passing this information along in an attribute with key "rpc.grpc.status_code". We need something here otherwise the collectors are going to start dropping a key piece of information when they start receiving Spans from SDKs that have implemented this new status spec.
It is not a blocker for release since (1) OpenTelemetry as a whole is usable even without it and (2) it can be added after release as a non-breaking change, if it does not make it into the release.
We just hit this too: https://github.com/open-telemetry/opentelemetry-python/pull/1308. So filed a PR with the results of the discussion.
Most helpful comment
I don't understand how this cannot be a blocker for GA. The change in the status code spec stomped all over the status code emitted from RPC spans. Without defining a specific tag to carry that status code the receivers of the traces (collector + exporters) won't have any idea how to report the actual RPC status to backends. My team definitely relies on that breakdown of RPC failure codes.
I'm 100% in favor of passing this information along in an attribute with key "rpc.grpc.status_code". We need something here otherwise the collectors are going to start dropping a key piece of information when they start receiving Spans from SDKs that have implemented this new status spec.