Blocked: An embeddable log component from Logs team is required.
Based on the design solution in https://github.com/elastic/apm/issues/179
We're currently linking out to the Logs app to display the related logs for a specific `trace.id`, but it would be a better experience to show the logs immediately, in the context of the trace sample, from within the APM app.
```jsx
<LogStream
  timestamp="1590690626648"
  filter={encodeURIComponent('trace.id:"0570667f4e27e2cac0d6c5b311c65918"')}
/>
```
There are a number of things that can be seen as dependencies for this feature, and this work should be planned ahead when we prioritize it.
Pinging @elastic/apm-ui (Team:apm)
Big +1 from me.
This seems to be blocked on the Logs Stream app being made embeddable.
@elastic/logs-metrics-ui do we have an issue we're tracking around converting Logs Stream app to an embeddable component?
cc @roncohen @alvarolobato @mukeshelastic @cyrille-leclerc
> do we have an issue we're tracking around converting Logs Stream app to an embeddable component?
Not yet - the approach probably highly depends on how exactly the embedding UI would have to control and inject the source configuration.
@weltenwort makes sense. What do you think about a simple first version that takes the same parameters as the link from APM to Logs (`timestamp` and `filter`)?
```jsx
<LogStream
  timestamp="1590690626648"
  filter={encodeURIComponent('trace.id:"0570667f4e27e2cac0d6c5b311c65918"')}
/>
```
In the first version I think it's fine if it's just a static number of log lines. If users want to see more lines, they can click "Open in Logs" which will take them to the logs app.
An alternative approach could be to specify the time range:
```jsx
<LogStream
  startTime="1590690626648"
  endTime="1590690636648"
  filter={encodeURIComponent('trace.id:"0570667f4e27e2cac0d6c5b311c65918"')}
/>
```
@formgeist and I came up with two other contexts (besides per-trace) to consider:
In both cases, the data should actually be small so a more minimal UI might be appropriate, handy if an embeddable logging ui component is a ways off. @formgeist is exploring the possibilities in parallel with this issue.
> per-span logs

Do we annotate log lines with `transaction.id`/`span.id`? I thought we only did this for `trace.id`?
The complicated part of embedding the logs is the source configuration, i.e. which log indices to look at and which columns to show. I see two options for how to implement it:
The former is more powerful but also requires more effort and handling of edge-cases. The latter is more restricted but requires fewer changes on the logs stream side.
> per-span logs
>
> Do we annotate log lines with `transaction.id`/`span.id`? I thought we only did this for `trace.id`?
In the Java agent, we add `transaction.id` and `trace.id`. If possible, I'd like to avoid span-scoped logs (adding `span.id` to logs) as the performance overhead for that would be significantly higher. Given that the `transaction.id` and the timestamp should tell you most of the time in which span the log happened, the cost/benefit ratio seems like it's not worth it to me.
@weltenwort The first approach definitely sounds more appealing to me since it's stateless and therefore typically less error-prone, but if the other approach is significantly easier on your part, we might have to go that route.
Either way, how can APM know which log indices to link to?
Thanks @felixbarny . Not having span scoped logs sounds okay to me.
> Either way, how can APM know which log indices to link to?
And how can the Logs UI know? One way out might be to rely on the new indexing strategy, from which we could derive a convention about the data type and namespace. Does the APM config have the concept of a namespace?
> One way out might be to rely on the new indexing strategy, from which we could derive a convention about the data type and namespace. Does the APM config have the concept of a namespace?
I haven't spent any time looking into the new indexing strategy for APM. "namespace" is the last part, as in `{type}-{dataset}-{namespace}`, right?
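To illustrate that convention (my own sketch, not from the thread; the concrete dataset and namespace values are made-up examples):

```typescript
// Illustrative sketch of the {type}-{dataset}-{namespace} naming convention.
// The dataset and namespace values below are hypothetical examples.
const indexName = (type: string, dataset: string, namespace: string): string =>
  `${type}-${dataset}-${namespace}`;

const example = indexName('logs', 'nginx.access', 'default');
// → 'logs-nginx.access-default'
```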
I would expect APM to use the same log source the Logs UI is using - there is a single data source configuration for the Logs UI, right? That is, logs will be in `filebeat-*` now and `logs-*` soon, plus the user can customize them.
> I'd like to avoid span-scoped logs (adding `span.id` to logs) as the performance overhead for that would be significantly higher
This is common with Jaeger and through OpenTracing, and users are already doing their own log correlation using `span.id`, so I don't think the two - displaying span-scoped logs and logging `span.id` by agents by default - need to be interdependent. I'd like to see what the design looks like and have a discussion on priorities before discarding this idea.
I'm thinking through some options for how we can make this as simple as possible. I don't think most apps in Kibana are going to want logs from specific indices, but rather "just please give me 'the logs' that match these ECS-based and/or time-based criteria" so I think we'll need to be able to accommodate that based on a very simple index-based convention.
> so I think we'll need to be able to accommodate that based on a very simple index-based convention.
Does that mean querying the indices that are already specified in the Logs Settings? I think that's what the user would expect.
@sqren with the way things are set up today, yes I think that's the sanest and simplest approach. If a user has created a separate space in order to look at a specific subsection of logs with a much more custom source, the logs component may not work as well in that use case, but we may be able to live with that.
I may be forgetting some problems with this approach but I don't think we need to make the indices customizable for a component like this.
> I don't think we need to make the indices customizable for a component like this.
Agree, I think it should simply read from the indices specified in Log Settings. In that case, what do you think about a Log component with the following interface?
```jsx
<LogStream
  timestamp="1590690626648"
  filter={encodeURIComponent('trace.id:"0570667f4e27e2cac0d6c5b311c65918"')}
/>
```
> In that case what do you think about a Log component with the following interface?
The log stream is now limited by a time range as well (for performance and uniformity reasons). So we have three timestamps: `startTime`, `endTime`, and a `timestamp` for the log line of interest. How about the following semantics:
| `startTime` | `endTime` | `timestamp` |
| --- | --- | --- |
| given | given | `endTime` |
| `timestamp` - 1 day | `timestamp` + 1 day | given |
| given | given | given, but clamped to `[startTime, endTime]` |
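A minimal sketch of those defaulting rules (my own illustration; `resolveTimes` and the prop types are hypothetical names, not the actual component's internals):

```typescript
// Sketch of the proposed semantics. All times are epoch milliseconds.
const DAY_MS = 24 * 60 * 60 * 1000;

interface LogStreamTimeProps {
  startTime?: number;
  endTime?: number;
  timestamp?: number;
}

function resolveTimes({ startTime, endTime, timestamp }: LogStreamTimeProps) {
  if (startTime !== undefined && endTime !== undefined) {
    // timestamp defaults to endTime; if given, it is clamped to the interval
    const target =
      timestamp === undefined
        ? endTime
        : Math.min(Math.max(timestamp, startTime), endTime);
    return { startTime, endTime, timestamp: target };
  }
  if (timestamp !== undefined) {
    // derive a one-day window on each side of the timestamp
    return { startTime: timestamp - DAY_MS, endTime: timestamp + DAY_MS, timestamp };
  }
  throw new Error('either [startTime, endTime] or timestamp must be given');
}
```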
The `filter` wouldn't have to be URI-encoded if passed as a prop.
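To illustrate the difference (a sketch of mine; the link path is hypothetical): the raw KQL string would be passed to the prop as-is, while encoding is only needed when the filter is embedded in a link URL.

```typescript
// The raw KQL filter, as it would be passed to the prop directly.
const filter = 'trace.id:"0570667f4e27e2cac0d6c5b311c65918"';

// Encoding is only required when building a link URL
// (the path below is a made-up example, not the real Logs app route).
const logsAppLink = `/app/logs/stream?logFilter=${encodeURIComponent(filter)}`;
```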
For the sake of keeping things organized, could we formulate the requirements as a response to #70513?
> and a timestamp for the log line of interest
What is the purpose of `timestamp` when `startTime` and `endTime` are given? Will lines that occur at that timestamp be highlighted?
> The filter wouldn't have to be URI-encoded if passed as a prop.
sgtm 👍
> For the sake of keeping things organized, could we formulate the requirements as a response to #70513?
Sure, I'll add that
> What is the purpose of `timestamp` when `startTime` and `endTime` are given? Will lines that occur at that timestamp be highlighted?
It's the time the log stream should be scrolled to. A `[startTime, endTime]` interval of one hour, for example, could contain millions of lines, so we'd have to know which time to scroll to initially.
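The scroll-target idea can be sketched roughly like this (my own illustration, not the component's actual code):

```typescript
// Sketch: given the (sorted) timestamps of the loaded log lines, pick the
// index of the first line at or after the target timestamp - the line the
// stream should initially scroll to.
function scrollTargetIndex(timestamps: number[], target: number): number {
  const i = timestamps.findIndex((t) => t >= target);
  // If every line is older than the target, scroll to the last line.
  return i === -1 ? timestamps.length - 1 : i;
}
```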
Okay, makes sense 👍
@weltenwort + team: Do you have enough information to have this ready for 7.10? Preferably in time for us to implement it on our side also :)
@weltenwort
> the approach probably highly depends on how exactly the embedding UI would have to control and inject the source configuration.
Does this comment answer that question? Specifically:
> I think it should simply read from the indices specified in Log Settings.
> Do you have enough information to have this ready for 7.10? Preferably in time for us to implement it on our side also :)
It's not on our roadmap so far. @sgrodzicki, @jasonrhodes, will you take that into consideration?
Okay, this was probably not communicated very clearly from our side. We had hoped to have it in for 7.9. Now let's aim for 7.10 if possible.
> Does this comment answer that question? Specifically:
>
> > I think it should simply read from the indices specified in Log Settings.
Yes, assuming you're fine with the fact that any changes the user makes in the Logs Settings will impact what is shown:
We should also talk about the feature set of the embedded view:
> Yes, assuming you're fine with the fact that any changes the user makes in the Logs Settings will impact what is shown:
Yes, that's what I would expect. Unless you foresee any problems with this approach.
> We should also talk about the feature set of the embedded view:
For the MVP I think just having the log lines (with timestamps) will be sufficient, similar to @formgeist's mockup:
We should add a link so the user can go to the Logs app if they want to dive deeper.
Btw, I don't assume we can correlate services to log lines, so please ignore that.
For the first version:
- Does it support streaming? No.
- Does it support the additional info flyout with its action links? No.
- Does it support view-in-context? Can we link to the Logs app for this?
- Does it support filtering and highlighting? No.
> > Do you have enough information to have this ready for 7.10? Preferably in time for us to implement it on our side also :)
>
> It's not on our roadmap so far. @sgrodzicki, @jasonrhodes, will you take that into consideration?
Adding this as a consideration for 7.10. I'll follow up once we agree on the scope.
Would it make sense to add `span.id` in the log messages emitted by elastic/ecs-logging libraries to have finer correlation between the distributed trace waterfall view and the log message?
CC @felixbarny @alex-fedotyev
Notes:

- `span.id` has just been added to ECS (https://github.com/elastic/ecs/pull/882)

It might, but I don't think it's worth it given the alternatives and overhead.
Adding the span ID would significantly increase the overhead, especially if there are many spans (some users might do 100s or 1000s of DB queries per transaction). Depending on the Agent and how it integrates with the loggers, this overhead also occurs if there are no log statements associated with a span.
As we already add the `transaction.id`, and as we also have the timestamp, we can make a pretty good guess as to which span a log belongs.
Which UI features do you have in mind that could benefit from a `span.id` in the logs? Are you thinking of adding a logs tab to spans? If so, would it be good enough to show all transaction logs and highlight those that were logged within the span's duration?
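Such highlighting could be approximated from timestamps alone; a rough sketch under my own assumptions (the types and field names are hypothetical, not agent or UI code):

```typescript
// Sketch: without span.id on the log lines, a line can be attributed to a
// span if its timestamp falls inside the span's duration.
interface LogLine {
  timestamp: number; // epoch ms
  message: string;
}

interface Span {
  start: number;    // epoch ms
  duration: number; // ms
}

function isWithinSpan(line: LogLine, span: Span): boolean {
  return (
    line.timestamp >= span.start &&
    line.timestamp <= span.start + span.duration
  );
}
```

Note that this is only a heuristic: concurrent sibling spans with overlapping durations would match the same lines, which is exactly the ambiguity a real `span.id` would remove.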
FYI we are doing additional design work with @formgeist on this topic.
👋! We have made a first version of an embeddable logs component in https://github.com/elastic/kibana/pull/76262
Please take a look and play with it. Feel free to ask me any questions or feature requests :)
Awesome @afgomez ! I'll ping you later this week and get it implemented in APM. Looks like it's going to be very easy. Thanks for doing this!
Replaced by https://github.com/elastic/kibana/issues/79995