This came up when I suggested the handler name instead of the route name for the HTTP semantic conventions (#548). I think that the handler name is a better grouping key than the route, but the route is a better display name, as @yurishkuro pointed out. So I think we should relieve span name from this double duty. I'll post ideas as separate comments to allow reactions.
EDIT:
Cf. current spec of span name https://github.com/open-telemetry/opentelemetry-specification/blob/e7fe34ad/specification/trace/api.md#span:
The span name is a human-readable string which concisely identifies the work represented by the Span, for example, an RPC method name, a function name, or the name of a subtask or stage within a larger computation. The span name should be the most general string that identifies a (statistically) interesting class of Spans, rather than individual Span instances. That is, "get_user" is a reasonable name, while "get_user/314159", where "314159" is a user ID, is not a good name due to its high cardinality.
I used the term "grouping key" here because I think using the span name as a key to group / sample similar spans is the reason why a Span name should have a low cardinality.
1) Add a new field "GroupingKey" to the span that defines a grouping key. The span name should be used if this is not set.
2) Add a semantic convention for a span attribute grouping_key.
3) Define a resource semantic convention semconv.isgroupingkey:<ATTRIBUTE_NAME>=true that defines a given span attribute as being viable as a grouping key (Problem: what if multiple exist? Have the value be an integer priority?).
I forgot to mention this in the SIG meeting today, but this topic was talked about there, so it seems this is an actual issue.
Can you define what a grouping key is? I'm not familiar with this term.
@jkwatson Thank you for that question, I think I accidentally coined that term here. I edited the description to hopefully be more clear. The "grouping key" requirement can be found in the spec as a requirement for low cardinality of the span name. I think the intention here was to allow grouping / sampling per span name.
@jkwatson You can think of the proposed grouping key as what an SQL query will look at in a "GROUP BY" clause. All spans with the same grouping key will be grouped together on the backend when being analyzed or presented in the UI.
Eventually, however, it is up to the backend to decide in which cases and for which purposes it applies grouping at all. It could also decide to group using another logic or even custom defined rulesets instead of relying on the grouping key (or span name).
Thanks for the answers, that helps a lot!
I forgot to mention this in the SIG meeting today, but this topic was talked about there, so it seems this is an actual issue.
@Oberon00 can you elaborate? I am not seeing an issue yet, especially since "grouping key" is a concept for machine processing, and why would machines care if they group by extra field or by readable span name (providing there's uniqueness).
@yurishkuro
"grouping key" is a concept for machine processing, and why would machines care if they group by extra field or by readable span name
Exactly! That's the idea behind my suggestions in this issue.
I am not seeing an issue yet
You can see the issue in the PR comment linked in the description (https://github.com/open-telemetry/opentelemetry-specification/pull/548#discussion_r404397569; actually it was your comment 😄). The issue is that if we just want the span name to be a display name, we can drop the requirement for low cardinality and would potentially have a better human-readability (e.g. we could use GET domain.example.com as display name instead of just HTTP GET), or we could drop the requirement for (easy) human readability and would have a better grouping key. The span name needs to satisfy both requirements which obviously means there must be compromises made in some cases.
Sorry, but I don't see "low cardinality" and "human readable" as conflicting requirements. In Jaeger UI, for example, you can select the Service from a dropdown, and the following dropdown is populated by span names - they have to be _both_ human readable and low cardinality (high cardinality will explode the dropdown and make it unusable).
"HTTP GET" is a very poor display name (it's also a poor grouping key). "GET domain.example.com" is a good display name, but it's an even worse grouping key.
I think this clearly demonstrates that there are cases where this conflicts.
If you want to use your groups for display (as opposed to sampling), you'll need a display name for your group. In that case you can use any one display name from within that group and display something like "'GET domain.example.com' and 32323 similar spans"
To summarize, you dislike the use of span name being both (1) the name for the individual span "instance" and (2) the (low-cardinally) name of the span "class".
I agree. If I'm looking at a single trace, I'd prefer to see
HTTP GET /accounts/123
HTTP GET /users/123
HTTP GET /users/456
HTTP GET /users/789
instead of
HTTP GET /accounts/{accountId}
HTTP GET /users/{userId}
HTTP GET /users/{userId}
HTTP GET /users/{userId}
The former is a more informative summary of the span.
Of course the latter form still needed for grouping (e.g. Jaeger dropdown UI).
Counterargument:
Span attributes are the place to store instance-specific details.
Counter-counterargument:
There are more attributes than can be displayed for all spans in a trace, and there are more span attributes than every backend can be required to understand significant vs. insignificant. If you want something looks like the former trace, you need the instrumentation to name it.
For compatibility I propose the opposite: the span "name" should continue to be the low-carnality span "class." And there should be a standardized attribute name or instance that summarizes the high-carnality span "instance."
And there should be a standardized attribute name or instance that summarizes the high-carnality span "instance."
I think having an optional display.name attribute is a reasonable idea (I've seen it being proposed in some Jaeger tickets), but it's certainly not a required field. Tracing UIs have wide flexibility in how they want to display the spans, e.g. many APM vendor tools prominently show the HTTP status code without having to drill down into span details, but surely we're not suggesting that status code should be a recommended part of the display name. Because each span already captures a lot of other attributes, UIs can be intelligent enough to decide which display format is the most useful, and it would not require the application code to duplicate the information it already provides in the span.
Tracing UIs have wide flexibility in how they want to display the spans, e.g. many APM vendor tools prominently show the HTTP status code without having to drill down into span details
My same feeling.
I like the idea of having an additional, optional attribute (such as display.name) that is defined in the semantic conventions, and that UIs may want to use for display instead of Span name.
optional
Agreed. No reason to make it mandatory: there is an abundently obvious fallback in span name.
To add a further example for this idea:
Since HTTP clients don't have to pattern-match URLs, they frequently are left without a great low-cardinality names, and the instrumentation is forced to choose something poor like HTTP GET or HTTP GET github.com. The marginal utility of an instance-specific name is large in this case.
Bike shed: The thing I don't like about display.name is "display;" the instrumentation shouldn't be defining display.color or display.lines or such.
So back to the initial description: Will this display.nane allow me to merge #548 then?
"HTTP GET" is a very poor display name (it's also a poor grouping key). "GET domain.example.com" is a good display name, but it's an even worse grouping key.
I think this clearly demonstrates that there are cases where this conflicts.
Why "GET domain.example.com" is a bad grouping key?
My $0.02 was given verbally in the last SIG meeting. I'd like to see a span name with message-formatting syntax, like "GET {http.url}". This is both descriptive and low cardinality, and when it comes time to display, the system can format the message.
If you care for one more opinion: span name is certainly a "grouping key" for me. A value that define a "class" of spans, as specification states. It is certainly has less information that possible display name of a single span. Every UI vendor can may his own decisions (probably even different ones for different users of that UI) what specific attribute of the span to use in their UI. Span name defines a unit of work, not a particular invocation of that work.
Taking that into account I think the original proposal of #548 is a good one and we should go with it.
Span name is definitely a grouping key.
The question is whether backends or instrumentations are in a better
position to display a good name/summary for an individual span.
I believe instrumentations should do choose attributes for a display name;
you believe that backends should understand attributes and generate a
display name.
On Mon, May 11, 2020, 13:03 Nikita Salnikov-Tarnovski <
[email protected]> wrote:
If you care for one more opinion: span name is certainly a "grouping key"
for me. A value that define a "class" of spans, as specification states. It
is certainly has less information that possible display name of a single
span. Every UI vendor can may his own decisions (probably even different
ones for different users of that UI) what specific attribute of the span to
use in their UI. Span name defines a unit of work, not a particular
invocation of that work.Taking that into account I think the original proposal of #548
https://github.com/open-telemetry/opentelemetry-specification/pull/548
is a good one and we should go with it.—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/open-telemetry/opentelemetry-specification/issues/557#issuecomment-626896030,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAKWTB4XOD7AA3XEKTDDTYLRRBDYNANCNFSM4MER6NXQ
.
Correct. Backends have to understand attributes anyway if they hope to present useful information to their users. "Useful" as in "relevant to users' use-cases". You don't make decisions about UI in sql queries.
@iNikem
Why "GET domain.example.com" is a bad grouping key?
Because it can have a very high cardinality. See https://github.com/open-telemetry/opentelemetry-specification/issues/270#issuecomment-539363629 EDIT: And https://github.com/open-telemetry/opentelemetry-specification/pull/416/files#r369591004
Ah, I thought you were talking about "GET
Btw, @Oberon00 , adding to my comment above. I think that "GET domain.example.com" is OK for span name and consequently for grouping key, if it really contains only domain name, without any path info. I assume any given application queries only a fixed number of domains :)
And I again want to express my support for #548. What else has to be done before that PR can be revived and merged?
We discussed this the java sig meeting today... My take away from that discussion is that the span name should be the low cardinality part, but we add an optional well-defined attribute like description or detail that can be the high cardinality representation like sql query or http url. This would allow the span name to be used in metrics, while still allowing more interesting visualization when inspecting a specific trace.
At this stage of the development lifecycle, it seems less invasive adding a semantic attribute than an additional required parameter, especially for many cases there won't be a high cardinality name (eg class/method name).
And I again want to express my support for #548. What else has to be done before that PR can be revived and merged?
For the record, I have since changed my opinion and think that http route information, if available, makes for a better span name than handler name. My current opinion is based on the following example.
Imagine we have a servlet container running Spring MVC application and having auto instrumentation agent attached. When we create SERVER span from servlet request we have full url, which is too high cardinality for span name, and servlet handler name, like MainServlet.doGet. Later Spring MVC kicks in and detects that requests should be handled to a PersonController.showDetails based on the request route /api/v2/persons/{personId}.
In the situation above I think it is sensible to update SERVER span name to route-based name /api/v2/persons/{personId}. But not PersonController.showDetails because that would be strange to servlet instrumentation created span.
Hi all, I've made a display hint proposal to help resolve this, please have a look: https://github.com/open-telemetry/opentelemetry-specification/pull/730
"HTTP GET" is a very poor display name (it's also a poor grouping key).
I want to address this argument. For server-side span, there is no excuse in using "HTTP GET" as a span name. A server always has some kind of mapping from URLs to handlers, that mapping most commonly called "route". The route is good both as display string and grouping key, satisfying the span name requirements.
The situation where "HTTP GET" names become necessary are for the client-side spans emitted by generic instrumentation of HTTP frameworks, which are removed from business logic and only have the (potentially high-cardinality) URL to work with. It is often not that interesting to build aggregations of those spans, so while certainly a poor grouping key those names do not affect the analysis that much. Without aggregation requirement, these spans are only useful to be shown in the Gantt-chart style views, where we're dealing with a single span and the UI can replace/augment the span name with the full URL.
I think that depends on how you choose to do it.
IMO there isn't a fundamental difference between client tracing and server tracing:
client.programming.function
HTTP GET client
HTTP GET server
server.programming.function
What is instrumented by whom depends.
If you are using a SOAP client, there an obvious automatic name for the operation. If you are using ApacheHttpClient you will have to choose and add the name yourself.
If you are using a Play server, there is an obvious automatic name for the operation. If you are using com.sun.net.httpserver.HttpServer you will have to choose and add the name yourself.
@open-telemetry/dotnet-approvers are you aware of this issue? It seems you already made a choice here that runs counter the current direction of this discussion when naming the Name DisplayName (see https://github.com/open-telemetry/opentelemetry-specification/pull/730#issuecomment-673882556)
I suggest to move this issue for Future milestone. This may be added after GA, not a breaking change.
The issue here is that the purpose of the existing span name is not clearly defined. Sure, you can add a display.name later on (should it be required). But you cannot as easily undo the compromises made in semantic conventions to make the existing span name a more useful display name and a worse grouping key for that.
See spec
The span name is a human-readable string which concisely identifies the work represented by the Span, for example, an RPC method name, a function name, or the name of a subtask or stage within a larger computation. The span name should be the most general string that identifies a (statistically) interesting class of Spans, rather than individual Span instances.
the issue is that optimizing a span name for human-readability will sometimes yield different results from optimizing for "most general string", as demonstrated in #548. In that PR, I agree that when "naming" an HTTP span, the route name is better for human consumption, but I maintain that the handler name is a more general string and still identifies a statistically interesting class of spans.
Outside of the spec, the span name is generally referred to as the 'grouping key' for operations and instrumenters are advised to avoid high cardinality span names. I agree with your (@Oberon00) overall point, but I would suggest that #548 could just as simply be resolved by turning the handler class/method name into a semantic attribute on a server span.
+1 to Sergey's point about tabling this until post-GA.
@Oberon00 this problem exists on many platforms already and every platform takes its route to resolve it. OpenTelemetry semantic convention made emphasize on (more) guaranteed grouping. Every app owner/vendor may adjust the span name with custom modules or custom code.
Also our requirement is to have a support for features in some backend. This is not a simple attribute and additional semantic is forced on it. So it would be great to have Jaeger or other backend to demonstrate how to use it.
I'm perfectly fine with tabling #548 and #730, but this issue is not resolved by either IMHO.
So it would be great to have Jaeger or other backend to demonstrate how to use it.
How to use what?
I think I'll propose a one sentence PR for this issue before more confusion arises 😉
I proposed #810 to clarify what I think we should resolve before GA.
Most helpful comment
We discussed this the java sig meeting today... My take away from that discussion is that the span name should be the low cardinality part, but we add an optional well-defined attribute like
descriptionordetailthat can be the high cardinality representation like sql query or http url. This would allow the span name to be used in metrics, while still allowing more interesting visualization when inspecting a specific trace.At this stage of the development lifecycle, it seems less invasive adding a semantic attribute than an additional required parameter, especially for many cases there won't be a high cardinality name (eg class/method name).