Kibana: [APM] Track OpenTelemetry usage

Created on 26 Aug 2020 · 12 Comments · Source: elastic/kibana

We are currently only checking statistics for Elastic APM agents. We should add telemetry to gauge usage of OpenTelemetry agents.

These are the known names of OpenTelemetry agents:

opentelemetry/cpp
opentelemetry/dotnet
opentelemetry/erlang
opentelemetry/go
opentelemetry/java
opentelemetry/nodejs
opentelemetry/php
opentelemetry/python
opentelemetry/ruby
opentelemetry/webjs
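
As a rough sketch, the names above could be kept in one place and used to tell OpenTelemetry agents apart from Elastic APM agents. The constant and helper names here are hypothetical and are not taken from the Kibana code base:

```python
# Known OpenTelemetry agent names, copied from the list above.
# OPEN_TELEMETRY_AGENT_NAMES and is_open_telemetry_agent are
# illustrative names, not existing Kibana identifiers.
OPEN_TELEMETRY_AGENT_NAMES = (
    "opentelemetry/cpp",
    "opentelemetry/dotnet",
    "opentelemetry/erlang",
    "opentelemetry/go",
    "opentelemetry/java",
    "opentelemetry/nodejs",
    "opentelemetry/php",
    "opentelemetry/python",
    "opentelemetry/ruby",
    "opentelemetry/webjs",
)


def is_open_telemetry_agent(agent_name: str) -> bool:
    """Return True only for the known OpenTelemetry agent names."""
    return agent_name in OPEN_TELEMETRY_AGENT_NAMES
```

An explicit list, rather than a prefix check, keeps the set of collected fields bounded, which matters if the telemetry mapping has to enumerate every agent.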

Ideally we collect the same telemetry as we do for other agents. This could be a pretty big increase in the number of fields collected, so we should also check with the Telemetry team.

TODO:

  • [ ] opentelemetry sample data
  • [ ] Talk to telemetry team and verify that we are allowed to add more fields to the telemetry mapping
apm v7.10.0

Most helpful comment

I think the key questions we want to answer using telemetry are the following (prioritized top down):

  1. How many customers are adopting OpenTelemetry.
  2. Which OpenTelemetry agents are getting adopted and how many services are instrumented and reporting to Elastic.
  3. What versions of the OpenTelemetry agents are adopted (especially interesting to get that data when OTel goes GA).
  4. What runtimes and frameworks are instrumented with OTel.

@dgieselaar - take a look at a sample transaction collected by OpenTelemetry via Elastic exporter.
https://gist.github.com/cyrille-leclerc/7af379fffbd80e072b860d4eb213e27e

It looks like most of the data fits the model we use to report data today.
For example, we could extend "services_per_agent" values to accommodate more types:

"services_per_agent": {
  "python": 1,
  "dotnet": 2,
  "java": 10,
  "rum-js": 0,
  "go": 0,
  "nodejs": 0,
  "js-base": 0,
  "ruby": 0,
  "opentelemetry-cpp": 1,
  "opentelemetry-dotnet": 1,
  "opentelemetry-erlang": 1,
  "opentelemetry-go": 1,
  "opentelemetry-java": 1,
  "opentelemetry-nodejs": 1,
  "opentelemetry-php": 1,
  "opentelemetry-python": 1,
  "opentelemetry-ruby": 1,
  "opentelemetry-webjs": 1
}
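
A minimal sketch of how such counts could be derived, assuming each APM document exposes an agent name and a service name. The flat document shape, the helper names, and the slash-to-hyphen key normalization are illustrative assumptions, not Kibana's actual telemetry task:

```python
from collections import defaultdict


def telemetry_key(agent_name: str) -> str:
    """Turn an agent name such as 'opentelemetry/java' into a flat key."""
    return agent_name.replace("/", "-")


def count_services_per_agent(docs):
    """Count distinct service names per agent name."""
    services = defaultdict(set)
    for doc in docs:
        services[telemetry_key(doc["agent_name"])].add(doc["service_name"])
    return {agent: len(names) for agent, names in services.items()}


# Hypothetical sample documents; real APM documents are nested and larger.
sample_docs = [
    {"agent_name": "java", "service_name": "checkout"},
    {"agent_name": "java", "service_name": "checkout"},
    {"agent_name": "opentelemetry/java", "service_name": "billing"},
    {"agent_name": "opentelemetry/java", "service_name": "inventory"},
    {"agent_name": "opentelemetry/python", "service_name": "search"},
]

print(count_services_per_agent(sample_docs))
```

Using a set per agent means a service that reports many documents is still counted once.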

Similar with agent versions:

"opentelemetry-java": {
  "agent": {
    "version": [
      "0.7.0"
    ]
  }
}
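
The version list could be built in a similar way. This is only a sketch under the assumption that each document carries an agent name and an agent version as flat fields. `collect_agent_versions` and the field names are hypothetical:

```python
from collections import defaultdict


def collect_agent_versions(docs):
    """Group the distinct agent versions seen for each agent key."""
    versions = defaultdict(set)
    for doc in docs:
        versions[doc["agent_name"]].add(doc["agent_version"])
    # sorted() orders version strings lexicographically, which is enough
    # for a stable report but is not a semantic-version ordering.
    return {
        agent: {"agent": {"version": sorted(found)}}
        for agent, found in versions.items()
    }


# Hypothetical sample documents for illustration only.
version_docs = [
    {"agent_name": "opentelemetry-java", "agent_version": "0.7.0"},
    {"agent_name": "opentelemetry-java", "agent_version": "0.7.0"},
    {"agent_name": "opentelemetry-java", "agent_version": "0.8.0"},
]

print(collect_agent_versions(version_docs))
```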

I am not certain whether OTel reports the runtime version, or whether a framework name such as "io.opentelemetry.auto.servlet-3.0" is useful to us.
But if we get what I outlined above, it would be a very good start.

All 12 comments

Pinging @elastic/apm-ui (Team:apm)

Let's look into Jaeger agents as well.

Please note that Jaeger agents are less critical to us as Jaeger is embracing OpenTelemetry.

@alex-fedotyev looks clear to me. Should we aim this at 7.10? @sqren

Let's aim for 7.10.

Thank you very much @dgieselaar and @sqren. This will greatly help us understand the adoption of OpenTelemetry with Elastic Observability. cc @axw

@sqren , @dgieselaar - I checked with telemetry team.
They are OK with adding new fields to the existing collector, schema mapping and index in 7.10.

Telemetry-Next will be used in the future, and I suggest we start designing what we want to report in 7.11!

@alex-fedotyev do you happen to have a link to an example I can run using an OTel agent so we can test out the changes?

@smith https://github.com/elastic/apm-contrib/tree/master/opentelemetry includes Python and Node.js apps instrumented with OTel. Let me know if you have any trouble getting it running.

Thanks @axw !

@smith
As I just learned, you can also change this file https://github.com/elastic/apm-contrib/blob/master/opentelemetry/collector-config.yml to point it to your APM server if needed.

Here is what the configuration needs to look like in that case:

exporters:
    elastic:
        apm_server_url: "https://elasticapm.example.com"
        secret_token: "hunter2"

@smith, copying a request from @mindbat here to make sure we don't miss it. Thank you!

For the 7.10 changes, can you open a separate mapping update request, with a list of fields, etc, so we can start planning out that work?
