Jaeger: Support for Agent level tag

Created on 18 May 2018  路  16Comments  路  Source: jaegertracing/jaeger

Requirement - what kind of business use case are you trying to solve?

I think it would be nice to be able to add tag at the Agent level. It would be similar to adding the tag at the Tracer level via JAEGER_TAGS environment variable

Here's some example of the tags we like to be able to inject into each traces:

  • AWS Instance metadata (similar to https://github.com/takus/fluent-plugin-ec2-metadata
  • Kubernetes metadata (similar to https://fluentbit.io/documentation/0.13/filter/kubernetes.html)
  • IP address (if you run the Agent and the application with the Tracer in different container, they will likely have different IP address)

Problem - what in Jaeger blocks you from solving the requirement?

I don't there's anything blocking this, it's a feature request

Proposal - what do you suggest to solve the problem or improve the existing situation?

Ability to provide Jaeger Agent with tags at startup (via something similar to JAEGER_TAGS)

Any open questions to address

  • Should these tags be stored and displayed like the Tracer tags? (under the Process section in the Query UI)

_edited to remove the plugin proposition and to reflect the latest state of the request_

Most helpful comment

An interesting blog past about something similar to what I'm looking to achieve (kubernetes pod metadata) : https://developers.soundcloud.com/blog/using-kubernetes-pod-metadata-to-improve-zipkin-traces

All 16 comments

I'm a bit unclear on the uses case here

For e.g.

if you run the Agent and the application with the Tracer in different container, they will likely have different IP address

If this is true, then wouldn't all agent tags be incorrect for the span?

It would be similar to adding the tag at the Tracer level.

Hypothetically, if these tags were to be added at the Tracer level, does it solve your problem?

I think the following information we would be useful to be attached to the traces (for performance but also root cause analysis):

AWS:

  • Instance type
  • Instance ID
  • AZ
  • Private IP (different than the IP you can get inside a container unless you use --net=host)
  • EC2 tags (we use tags a lot internally so that's a valuable piece of information)

Note: The EC2 tags can change without any reboot

Kubernetes (we are running the Agent as a sidecar currently):

  • Pod Name
  • Namespace
  • Pod Labels
  • Pod Id

Note: This could easily be achieved by injecting environment variable using the Downward API

Docker (of the Jaeger Agent container):**

  • Container tag (Is it using latest? Or is it still hardcoded to 0.6? etc.)
  • Container sha (because a tag can change overtime)

Why in the Agent?

Mostly because you only need to code it once and not in each Client libraries.
But here's a few other reasons:

  • In Kubernetes, you would only need to mount the ServiceAccount in the Agent container
  • If you run multiples containers in AWS, it would reduce the number of API requests (one per host instead of one per instrumented container)

On the other hand, here's a problem I see if it's done in the Agent:

  • If you run the Agent as a Kubernetes DaemonSet, there's no way to extract the application pod information (because the Agent doesn't really know where the data came from)

+1 to this feature request. I see this as a natural extension to the JAEGER_TAGS env var as seen in the Go and Java clients.

Plugins support for AWS/Kubernetes/etc tagging

I'm -1 on this, though, as I think it's an unnecessary complexity. I think the idea for it is to support changes in EC2 tags without restarting the main process. If that's the reasoning, I would prefer to see the Agent (or any other service) to properly handle SIGHUP, to reload the configuration.

Why in the Agent?
Mostly because you only need to code it once and not in each Client libraries.

@ledor473 I feel that using the JAEGER_TAGS env var could be a potential solution, you can have an independent process that injects whatever is required into this. Are there difficulties doing this?

I see this as a natural extension to the JAEGER_TAGS env var as seen in the Go and Java clients.

@jpkrohling I'm not convinced that applying these tags for spans reported by a given agent adds a lot of value over setting these environment variables and having the clients inject them into spans.

I'm not against the idea of enriching span data with information from the client, perhaps in the form of tags, but I think that the existing mechanism is good enough for the use case. (We should standardize across all official clients)

I'm not convinced that applying these tags for spans reported by a given agent adds a lot of value over setting these environment variables and having the clients inject them into spans.

On environments making use of containers, the container might not have access to the entire "context", like what's the actual node it's running on, or the DC/Rack. The agent might have this info, as it would potentially be running directly on the node (bare metal/VM), and not inside a container.

I think it's fine to only support environment variable and ignore the plugin part of the request.
I can easily be handled by another process outside of the Agent which is totally fine for me.

Still, I don't think it's the same feature as JAEGER_TAGS environment variable.
The version of the Jaeger Agent can't easily be injected into every client for instance. It _could_ be shared between the Agent and the Client using the Control Flow endpoint, but it's not possible right now.

I'm a bit confused. If the agent is running as a daemon on the host, it by definition cannot have enough context to enrich the spans from a given service, because it is shared with other services. And if it is running as a sidecar, then I am not sure why it matters who gets the desired metadata via env, the sidecar or the main process.

If the agent is running as a daemon on the host, it by definition cannot have enough context to enrich the spans from a given service

The agent wouldn't care what's inside the spans and which service is submitting it, just that they all should contain the tag tenant: xyz.

I have no objections to supporting something like JAEGER_TAGS in the agent.

Implementing the equivalent of JAEGER_TAGS in the Agent would work for our use case. I can edit the initial post to remove the "plugin" idea and focus around Agent-level tags.

The open question is still valid. It will help guide the implementation of it

Re open question - yes, these should go into tracer tags imo.

Any progress on this? Our use-case is that we have several clusters runnings the same applications.

We'd like to filter by cluster, but adding that information into each application config seems non-ideal and error-prone.

I quickly look into implementing this but I couldn't find any easy place to hook into.
Unless I've missed something (totally possible as I've never looked at Thrift), the Agent seems to only forward/proxy what it receives to the Collector (with a buffer).

the Agent seems to only forward/proxy what it receives to the Collector (with a buffer).

pretty much. But we can add an enrichment step that will add some tags to the Process, so that we wouldn't need to change the data model.

An interesting blog past about something similar to what I'm looking to achieve (kubernetes pod metadata) : https://developers.soundcloud.com/blog/using-kubernetes-pod-metadata-to-improve-zipkin-traces

Note: #1396 implements agent tags, but only for gRPC reporting path (not for TChannel, which we're looking to deprecate).

Was this page helpful?
0 / 5 - 0 ratings

Related issues

yurishkuro picture yurishkuro  路  4Comments

vprithvi picture vprithvi  路  3Comments

albertteoh picture albertteoh  路  3Comments

rur0 picture rur0  路  4Comments

pavolloffay picture pavolloffay  路  3Comments