Kibana: [Observability] Shared ECS fields across solutions (Uptime, APM, Logs, Metrics)

Created on 18 Jun 2020 · 17 Comments · Source: elastic/kibana

For the Observability Landing Page we will be displaying visualisations from all the observability solutions. In the first version the only way to filter these is via the time picker but for the next iteration it should be possible to filter by a lot more dimensions.

_The Observability Landing page will have a search bar that allows the user to filter the data on the page_

Search bar in each solution
Many solutions already allow users to filter by any field in their mapping through the search bar. With autocompletion, it provides good UX for discovering interesting fields and values, and lets users slice and dice their data to their needs.

The problem
A search bar that spans across all observability solutions poses some challenges, primarily which fields the user should be able to filter by.
One approach is to suggest all fields across the observability solution mappings. This would be a list of 100s of fields where most are specific to a single solution, so querying any of them would render 3 out of 4 visualisations empty. It would be very difficult for the user to find the few fields that are actually useful when filtering and correlating data across multiple observability solutions.

Suggestion: a curated list of handpicked fields
To avoid overloading the user with 100s of useless fields, we should aim for a curated list. The fields will be handpicked by us based on which are most useful and standardised across most (or all) obs solutions.
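One way to seed such a list is to count how many solutions map each field and keep only the fields above a coverage threshold. The sketch below is only an illustration: the per-solution field sets are hypothetical examples, not the real mappings, and the threshold of 3 is an arbitrary assumption.

```python
from collections import Counter

# Hypothetical field sets per solution (assumption for illustration only);
# real mappings contain hundreds of fields.
SOLUTION_FIELDS = {
    "apm": {"service.name", "service.environment", "host.name", "transaction.type"},
    "logs": {"service.name", "host.name", "event.dataset", "log.level"},
    "metrics": {"service.name", "host.name", "event.dataset", "system.cpu.total.pct"},
    "uptime": {"service.name", "monitor.status", "url.domain"},
}


def shared_fields(solution_fields, min_solutions):
    """Return (field, coverage) pairs for fields mapped by at least `min_solutions` solutions."""
    counts = Counter(
        field for fields in solution_fields.values() for field in fields
    )
    shared = [(field, n) for field, n in counts.items() if n >= min_solutions]
    # Highest coverage first, then alphabetical so the order is stable.
    return sorted(shared, key=lambda item: (-item[1], item[0]))


candidates = shared_fields(SOLUTION_FIELDS, min_solutions=3)
```

A count like this only produces candidates. The final list would still be handpicked, since a field can exist in a mapping without being populated.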

Questions

  • Which fields are already standardised across observability?
  • Which fields are important to each observability solution and not yet standardised?

Fields in ECS

These are fields that are important to at least one of the Obs solutions and are already in ECS. Having a field in ECS is only the first step towards standardising across teams. E.g. service.name is in ECS but is not populated by Beats agents.

Fields not in ECS

These are the fields that are important to a solution but not yet in ECS nor standardised across all solutions.

Observability apm logs-metrics-ui uptime enhancement


All 17 comments

Pinging @elastic/uptime (Team:uptime)

Pinging @elastic/apm-ui (Team:apm)

Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)

Big +1 on this effort. @cyrille-leclerc @mukeshelastic this is related to our discussion last week. Hopefully we can help our teams identify the most important fields to correlate information.

Might want to consider the dataset.* fields from Elastic Agent. These will include type, name, namespace (and potentially category).

@mostlyjason I found the field event.dataset in ECS (ECS event namespace) and the fields service.name and service.type (ECS service namespace), but I didn't find a dataset namespace in ECS, nor in the events generated by Filebeat, Metricbeat or APM.
Could you please give an example or link to docs?

Looking forward to seeing what you come up with.

If any of the fields selected aren't in ECS, please open an issue to have them added! The common fields should be in ECS. When something's not in ECS, it only means it's not in ECS yet :-)

In a recent unrelated discussion I was suggesting the following high level fields:

  • event.module and event.dataset together (the "module.dataset" convention isn't always followed in the event.dataset field)

    • Note: the dataset.* fields may supersede the above, but for 7.x the above is still relevant

  • event.kind and event.category if they're filled by observability
  • perhaps agent.type is interesting as well?
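As a rough sketch of filtering on event.module and event.dataset together, the helper below builds a KQL string for the search bar. The function names and the example values are hypothetical. Both fields are matched explicitly because, as noted above, event.dataset does not always follow the "module.dataset" convention.

```python
def kql_term(field, value):
    """Format one `field: "value"` KQL clause, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}: "{escaped}"'


def module_dataset_filter(module, dataset):
    """Combine event.module and event.dataset into a single KQL filter (hypothetical helper)."""
    clauses = [
        kql_term("event.module", module),
        kql_term("event.dataset", dataset),
    ]
    return " and ".join(clauses)


# Example values are illustrative only.
query = module_dataset_filter("nginx", "nginx.access")
```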

@mostlyjason: dataset.* is not a field APM produces or consumes but I imagine that it could be relevant for Logs or Metrics?
Is it a field the APM agents should be populating to create parity across the solutions?

We have an open PR to add the dataset.* fields to ECS https://github.com/elastic/ecs/pull/845. They are not generated by Beats currently, only by Elastic Agent because they are part of the new indexing strategy. They are used to indicate the type of data being sent, the dataset (diskio, cpu, nginx access, etc.) and the namespace which is a user-defined string to partition data how they like. They use the type constant_keyword to allow for fast searching.
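To make the description concrete, here is a minimal sketch of what such a mapping and a matching filter could look like, written as Python dictionaries. The sub-field names (type, name, namespace) are taken from the earlier comment in this thread; the exact layout in the ECS PR may differ, and the helper name and example values are assumptions.

```python
# Assumed mapping fragment: each dataset.* field uses constant_keyword,
# so every document in an index shares the same value for it.
DATASET_MAPPING = {
    "properties": {
        "dataset": {
            "properties": {
                "type": {"type": "constant_keyword"},
                "name": {"type": "constant_keyword"},
                "namespace": {"type": "constant_keyword"},
            }
        }
    }
}


def dataset_filter(data_type, name, namespace):
    """Build an Elasticsearch bool filter on the three dataset.* fields (hypothetical helper)."""
    return {
        "bool": {
            "filter": [
                {"term": {"dataset.type": data_type}},
                {"term": {"dataset.name": name}},
                {"term": {"dataset.namespace": namespace}},
            ]
        }
    }


# Example values are illustrative only.
query = dataset_filter("metrics", "cpu", "production")
```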

@sqren I hope that when APM is integrated with Elastic Agent, it will begin using the new indexing strategy. We should plan out a migration path to avoid user impact.

@ruflin's talk at GAH is a good intro to the new indexing strategy. Please note that we renamed our fields from stream.* to dataset.*

@sorantis FYI

Linking the work on Infrastructure inventory common schema here for visibility as it's relevant.
cc @exekias @kaiyan-sheng

Note that the standardisation in ECS of the field service.environment is also tracked on https://github.com/elastic/apm-server/issues/3875 and is requested with https://github.com/elastic/ecs/pull/891

FYI RFC to standardise the environment field is in progress: https://github.com/elastic/ecs/blob/master/rfcs/text/0002-rfc-environment.md

@cyrille-leclerc in the opening you say "curated by us". I would suggest "started by us and customizable by the user". There will always be cases where a customer extends ECS for their business purposes and they should be able to also add to the dashboard.

I'll just note that none of the proposed fields really work with uptime except service.name (and then only when configured correctly). It's not entirely surprising given that heartbeat sits external to a server, and thus lacks much data.

@sqren is this still an open issue that needs work for 7.10?

@paulb-elastic I removed the 7.10 label since it's not tied to any release and this is not actively being worked on. I'll leave it open since this is something we still need.
