Kibana: [APM] Show Span.sync property in Span detail view

Created on 20 Nov 2018  ·  22Comments  ·  Source: elastic/kibana

Summary

Describe the feature:
Span.sync has been added to apm-server and it shows whether the span is for a synchronous or asynchronous operation. A synchronous operation in this context means an operation that blocks the execution of the thread until it finishes.

This property should be displayed in the Span detail view.

Possibly we can also highlight synchronous spans in the waterfall, however this could depend on the agent, cc @elastic/apm-agent-devs .

Please note that this property is optional and should not be displayed if not available.

Describe a specific use case for the feature:
In frontend applications having synchronous http requests for example is a bad practice and affects the application's performance significantly.

Design solution

We're choosing to highlight these spans both in the Timeline and Span details views. Since languages have different ways of describing this, we will use different label naming.

  1. Display euiBadge euiBadge--warning on the Span in the Timeline visualization after the duration.

01 dt - transaction group detail parent

  1. Display euiBadge euiBadge--warning in the Span details flyout appended the span name.

03 dt - transaction - span flyout

Both implementation are based on the following condition;

  • undefined we don't show anything
  • true we show a "blocking" label
  • false we show an "async" label
apm good first issue

All 22 comments

Pinging @elastic/apm-ui

Just so I fully understand the use-case and how we might want to display this to the user (depending on language). In general span.sync = true is bad, because it means that the span is most likely to block all other operations in the execution. When true we want to display this to the user both on the span in the Timeline, to indicate this behaviour, and in the span details, preferably by using some badge or icon that is both serving as a highlighter and description.

In more visual context, I'm suggesting something along the lines of;

01 dt - transaction group detail parent
03 dt - transaction - span flyout

Not sure what we should label it, so gone with "Sync" for now - I'd rather not have a long name, but we could have a tooltip or title tag that describes the badge in more detail.

Note: please ignore that the two examples are not the same span, and possibly that they're not indicative of "blocking" in the actual timeline visualization. Just for displaying the badge on the span in the details.

cc @elastic/apm-agent-devs

In general span.sync = true is bad, because it means that the span is most likely to block all other operations in the execution

Well, it depends. In some languages like Ruby, sync is the default and not bad. So it might not make sense for @mikker to set this to true for every span. There is however the posibility in Ruby to do async operations, but it's much more rare. As this is very rare, I would suggest that the Ruby agent only used this property in case it was an async span (and in that case set it to false).

In Node.js and the JavaScript RUM agents, you're right that sync is bad. So it really depends on the language I would say.

Not sure what we should label it

I think there was a general consensus that we should have chosen a different name for the property than "sync", but at the time this was brought up to the rest of the agent devs it was too late. This however doesn't hinder us in using a better label.

When sync is true, I would suggest using the label "blocking" instead.

@watson Thanks for feedback - I forgot that Ruby was probably an exception. And I prefer "blocking", it's more descriptive. Still doesn't stop us from adding a title label to describe it in more detail.

I'm not sure Ruby is the only exception. I would think blocking is the default and expected behavior in Java, Go, Python, Ruby and .NET. While all of these languages have been getting abilities to run non-blocking code, only Node.js/JavaScript is non-blocking by default AFAIK.

Even though I'd prefer using the same terminology across agents, it might not feel natural in every case as per @gregkalapos's comment:

for .NET this is very useful, probably naming it ‘async’ would be more natural for .NET, but ‘sync’ or even ‘blocking’ is ok to me.
https://github.com/elastic/apm-dev/issues/328#issuecomment-440975935

@sqren Yeah, I guess that's what I was trying to say (but probably not in an easy to understand way 😅).

My dream implementation is:

  • For languages where sync (aka blocking) is the norm, I think the label should be "async" when sync: false
  • For languages where async (aka non-blocking) is the norm, I think the label should be "blocking" when sync: true

Won't it be confusing to the user if you have a trace with multiple services using different languages and the label is different?

Won't it be confusing to the user if you have a trace with multiple services using different languages and the label is different?

True. Didn't think about that.

@formgeist that's a valid concern. I'm not sure. I think it might work ok, but it's hard to be sure about these things.

Actually, even with that caveat I still prefer the labels "async" + "blocking". I think I would understand that if I used the UI without knowing the back story.

@sqren How will you keep track of which languages should display what?

@sqren Apologies, I missed @watson definition above 😅

How will you keep track of which languages should display what?

Would have to be a hardcoded list in the ui. So not so nice. It also wouldn't work for third party agents (we'd have to take a guess on which terminology to use if agent is not in the list).

So if we can do as @watson suggests ("async" + "blocking"), that would be much better.

As mentioned above, to avoid noise agents should only set sync property when it is out of the ordinary. So nodejs agent should only set sync=true (because it's normally async).

To summarize my thoughts. When sync is:

  • undefined we don't show anything
  • true we show a "blocking" label
  • false we show an "async" label

Updated the description with visual implementation examples and the conditions for the label 👍 Thanks for the help getting this figured out. So let's see what this looks like in trace timelines and whether we want to alter the placements.

@formgeist I like. That way the UI doesn't have to make custom code for each agent. Instead the agent chooses if the label should be displayed or not by either including the property in the payload or not 😃

Thanks @formgeist! Looks great 🎉

@jahtalab It doesn't look like any of the opbeans applications are sending spans as sync: true - I could only find sync: false.

As a rule of thumb I try to defer implementation until we have adequate data. I'll mark this as "blocked" until we have adequate data.

With this PR some spans will have sync: true for the opbeans-frontend!

Awesome, thanks @jahtalab !

@formgeist what color are we using for the badge?

@brittanyjoiner15 I think the original designs called for using the warning color that can be passed to the EuiBadge, but looking closer at it now, I think we should go with the default instead. We already have a lot of different indication colors in the Timeline. Thanks for the taking this on 👍

Was this page helpful?
0 / 5 - 0 ratings

Related issues

socialmineruser1 picture socialmineruser1  ·  3Comments

MaartenUreel picture MaartenUreel  ·  3Comments

ctindel picture ctindel  ·  3Comments

timroes picture timroes  ·  3Comments

timroes picture timroes  ·  3Comments