After https://github.com/open-telemetry/opentelemetry-specification/pull/368 gets merged we will have support for array values.
If we also add support for maps and nesting, attribute values will be able to represent arbitrary nested data structures where needed.
This will apply to span and resource attributes.
Can we explicitly state that this applies to resources too? I believe that span attributes and resources (which are only specified in the proto, currently) are specified with the same structure.
Are there any use cases for arbitrary nesting? I think (multi)maps would be useful to store, e.g., HTTP headers, but what would be the rationale for arbitrary nesting?
An arbitrarily nested map can represent categorized values, e.g. {"http" : {"url":...,"method":...}} or {"sql" : {"query":...,"engine":...}}. It can also host vendor-specific data like {"aws": {"account_id":...}} in situations where Resource isn't the right place, e.g. client-side metrics.
In #579, Tigran's example seems to contain a use-case. The resource of "application B" is a set of key-value attributes.
Semantically, I agree that the dotted string notation is equivalent to a map, although I'd like to point out that, at least from the tracing client perspective, dotted strings are a slightly more efficient representation.
Consider the following representations for the key-value pair 'http.method': 'GET'.
Dotted string representation
{ 'http.method': 'GET' }
To represent this we need 1 map and 2 strings; 3 total objects.
Map representation
{ 'http' : { 'method': 'GET' } }
This requires 2 maps, 3 strings; 5 total objects.
Furthermore, most tracing backends do not support nested attributes (as far as I know), and will need to flatten them into dotted strings. This is something that will have to be done either in the tracing clients during export, or by the backends on ingest.
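To make the flattening step concrete, here is a minimal sketch in Python of how a client or backend might turn nested maps into dotted-string keys. The function name `flatten_attributes` and the choice of `.` as the separator are assumptions for illustration, not part of any OpenTelemetry API.

```python
from typing import Any, Dict


def flatten_attributes(attrs: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested maps into a single-level map with dotted keys.

    Hypothetical helper: the name and the '.' separator are assumptions.
    """
    flat: Dict[str, Any] = {}
    for key, value in attrs.items():
        # Build the dotted key from the parent prefix, if there is one.
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested maps; note that an empty map yields no keys.
            flat.update(flatten_attributes(value, full_key))
        else:
            flat[full_key] = value
    return flat


nested = {"http": {"method": "GET", "url": "https://example.com/"}}
print(flatten_attributes(nested))
# prints {'http.method': 'GET', 'http.url': 'https://example.com/'}
```

This sketch silently drops empty nested maps and does not guard against key collisions (for example, a map that contains both a literal 'http.method' key and a nested 'http' map), which is part of the added complexity under discussion.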
While I recognize that this does have some advantages with regard to semantics and the data that can be represented, it does introduce complexity into tracing clients and backends. I'm not saying we shouldn't pursue this proposal, but we should discuss what the actual benefits are, and whether the added complexity is worth the tradeoff.
I'd also like to add that with the dotted-string notation, tracing clients can reduce the runtime string allocations to 0 for attribute keys by introducing constants for semantic conventions (and any other commonly used keys). We would lose this ability by changing to nested maps.
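As a rough illustration of the constants idea, the sketch below defines attribute keys once at module level and reuses them on every call. The constant names and the `record_request` helper are hypothetical, and how much allocation this actually saves depends on the language and runtime.

```python
# Hypothetical module-level constants for commonly used attribute keys.
# Each key string is created once and reused by every call below.
HTTP_METHOD = "http.method"
HTTP_URL = "http.url"


def record_request(attributes: dict, method: str, url: str) -> None:
    """Store request attributes under the shared constant keys."""
    attributes[HTTP_METHOD] = method
    attributes[HTTP_URL] = url


attrs: dict = {}
record_request(attrs, "GET", "https://example.com/")
print(attrs)
# prints {'http.method': 'GET', 'http.url': 'https://example.com/'}
```

With nested maps, the same data would need an intermediate map for 'http' to be created or looked up per span, in addition to the key strings.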
I should also clarify that I am completely ok with array support. It's the nested map support that I have reservations about.
@bogdandrutu can you please clarify why this was reopened?
@tigrannajaryan because of last week's discussion and the concerns raised by @mwear
Was closing it even intentional? I can't remember any final decision in this issue. At least it's not documented here?
I guess closing this via a commit into some personal repository (tigrannajaryan/exp-otelproto@1507a2f) was accidental? It's surprising to me that GitHub allows this (I would even be tempted to call that a bug).
Definitely not intentional. I don't understand why a commit in my personal repo, which is not even a fork of this one, results in closing an issue here. That's weird. Agreed, it appears to be a bug.
To clarify: are nested arrays allowed, e.g. [][]int, or only single-dimensional arrays, e.g. []int?
Currently only homogeneous, single-dimensional arrays of primitives are allowed. This issue is about relaxing that. EDIT: Discussion is happening mostly on https://github.com/open-telemetry/opentelemetry-specification/pull/596 recently.
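For readers who want the current rule spelled out, here is a minimal sketch of a validator that accepts primitives and homogeneous, single-dimensional arrays of primitives, and rejects everything else. The function name is hypothetical, the set of primitive types (bool, int, float, str) is an assumption mapped onto Python, and treating an empty array as valid is also an assumption.

```python
from typing import Any

# Assumed mapping of the allowed primitive types onto Python types.
PRIMITIVES = (bool, int, float, str)


def is_valid_attribute_value(value: Any) -> bool:
    """Return True for a primitive or a homogeneous 1-D array of primitives."""
    if isinstance(value, PRIMITIVES):
        return True
    if isinstance(value, (list, tuple)):
        if not value:
            return True  # assumption: an empty array is accepted
        first_type = type(value[0])
        if first_type not in PRIMITIVES:
            return False  # nested arrays and maps are rejected here
        # Exact type comparison, so [True, 1] counts as mixed.
        return all(type(item) is first_type for item in value)
    return False  # maps and any other types


print(is_valid_attribute_value([1, 2, 3]))        # prints True
print(is_valid_attribute_value([[1, 2], [3]]))    # prints False
print(is_valid_attribute_value({"method": "GET"}))  # prints False
```

Relaxing the criteria, as this issue proposes, would amount to accepting the nested-array and map cases that this check currently rejects.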
Great. So my linked PR at the moment meets the spec, but won't once we relax the criteria.
I maintain we should not relax the criteria, but see #596. This has also been discussed in a few SIG Spec meetings.
The corresponding PR didn't receive a single approval: https://github.com/open-telemetry/opentelemetry-specification/pull/596#issuecomment-673883652. That suggests nobody is asking for this feature. Moving this issue to the Future milestone.
For the record for future discussions: map values and nested values are currently supported at the OTLP protocol level, but are not utilized by the OpenTelemetry API. This issue is to discuss whether we also want to support such values in the API.
Given that map values and nested values are supported at the OTLP protocol level, should we treat the current spec's MUST for only primitives and homogeneous lists as the base requirement, and potentially add a MAY for maps and nested values? The expanded types have been working well at the trace level, so strictly implementing the spec would require us to add validations and restrict inputs, which I'm hesitant to do when the current behavior works and avoids those extra operations.
https://github.com/open-telemetry/opentelemetry-erlang/issues/111