Jaeger: Support for Agent level tag

Created on 18 May 2018 · 16Comments · Source: jaegertracing/jaeger

Requirement - what kind of business use case are you trying to solve?

I think it would be nice to be able to add tag at the Agent level. It would be similar to adding the tag at the Tracer level via JAEGER_TAGS environment variable

Here's some example of the tags we like to be able to inject into each traces:

AWS Instance metadata (similar to https://github.com/takus/fluent-plugin-ec2-metadata
Kubernetes metadata (similar to https://fluentbit.io/documentation/0.13/filter/kubernetes.html)
IP address (if you run the Agent and the application with the Tracer in different container, they will likely have different IP address)

Problem - what in Jaeger blocks you from solving the requirement?

I don't there's anything blocking this, it's a feature request

Proposal - what do you suggest to solve the problem or improve the existing situation?

Ability to provide Jaeger Agent with tags at startup (via something similar to JAEGER_TAGS)

Any open questions to address

Should these tags be stored and displayed like the Tracer tags? (under the Process section in the Query UI)

_edited to remove the plugin proposition and to reflect the latest state of the request_

Source

ledor473

Most helpful comment

An interesting blog past about something similar to what I'm looking to achieve (kubernetes pod metadata) : https://developers.soundcloud.com/blog/using-kubernetes-pod-metadata-to-improve-zipkin-traces

ledor473 on 19 Sep 2018

👍2

All 16 comments

I'm a bit unclear on the uses case here

For e.g.

if you run the Agent and the application with the Tracer in different container, they will likely have different IP address

If this is true, then wouldn't all agent tags be incorrect for the span?

It would be similar to adding the tag at the Tracer level.

Hypothetically, if these tags were to be added at the Tracer level, does it solve your problem?

vprithvi on 23 May 2018

I think the following information we would be useful to be attached to the traces (for performance but also root cause analysis):

AWS:

Instance type
Instance ID
AZ
Private IP (different than the IP you can get inside a container unless you use --net=host)
EC2 tags (we use tags a lot internally so that's a valuable piece of information)

Note: The EC2 tags can change without any reboot

Kubernetes (we are running the Agent as a sidecar currently):

Pod Name
Namespace
Pod Labels
Pod Id

Note: This could easily be achieved by injecting environment variable using the Downward API

Docker (of the Jaeger Agent container):**

Container tag (Is it using latest? Or is it still hardcoded to 0.6? etc.)
Container sha (because a tag can change overtime)

Why in the Agent?

Mostly because you only need to code it once and not in each Client libraries.
But here's a few other reasons:

In Kubernetes, you would only need to mount the ServiceAccount in the Agent container
If you run multiples containers in AWS, it would reduce the number of API requests (one per host instead of one per instrumented container)

On the other hand, here's a problem I see if it's done in the Agent:

If you run the Agent as a Kubernetes DaemonSet, there's no way to extract the application pod information (because the Agent doesn't really know where the data came from)

ledor473 on 23 May 2018

+1 to this feature request. I see this as a natural extension to the JAEGER_TAGS env var as seen in the Go and Java clients.

Plugins support for AWS/Kubernetes/etc tagging

I'm -1 on this, though, as I think it's an unnecessary complexity. I think the idea for it is to support changes in EC2 tags without restarting the main process. If that's the reasoning, I would prefer to see the Agent (or any other service) to properly handle SIGHUP, to reload the configuration.

jpkrohling on 24 May 2018

Why in the Agent?
Mostly because you only need to code it once and not in each Client libraries.

@ledor473 I feel that using the JAEGER_TAGS env var could be a potential solution, you can have an independent process that injects whatever is required into this. Are there difficulties doing this?

I see this as a natural extension to the JAEGER_TAGS env var as seen in the Go and Java clients.

@jpkrohling I'm not convinced that applying these tags for spans reported by a given agent adds a lot of value over setting these environment variables and having the clients inject them into spans.

I'm not against the idea of enriching span data with information from the client, perhaps in the form of tags, but I think that the existing mechanism is good enough for the use case. (We should standardize across all official clients)

vprithvi on 24 May 2018

I'm not convinced that applying these tags for spans reported by a given agent adds a lot of value over setting these environment variables and having the clients inject them into spans.

On environments making use of containers, the container might not have access to the entire "context", like what's the actual node it's running on, or the DC/Rack. The agent might have this info, as it would potentially be running directly on the node (bare metal/VM), and not inside a container.

jpkrohling on 24 May 2018

I think it's fine to only support environment variable and ignore the plugin part of the request.
I can easily be handled by another process outside of the Agent which is totally fine for me.

Still, I don't think it's the same feature as JAEGER_TAGS environment variable.
The version of the Jaeger Agent can't easily be injected into every client for instance. It _could_ be shared between the Agent and the Client using the Control Flow endpoint, but it's not possible right now.

ledor473 on 24 May 2018

I'm a bit confused. If the agent is running as a daemon on the host, it by definition cannot have enough context to enrich the spans from a given service, because it is shared with other services. And if it is running as a sidecar, then I am not sure why it matters who gets the desired metadata via env, the sidecar or the main process.

yurishkuro on 24 May 2018

If the agent is running as a daemon on the host, it by definition cannot have enough context to enrich the spans from a given service

The agent wouldn't care what's inside the spans and which service is submitting it, just that they all should contain the tag tenant: xyz.

jpkrohling on 24 May 2018

I have no objections to supporting something like JAEGER_TAGS in the agent.

yurishkuro on 24 May 2018

Implementing the equivalent of JAEGER_TAGS in the Agent would work for our use case. I can edit the initial post to remove the "plugin" idea and focus around Agent-level tags.

The open question is still valid. It will help guide the implementation of it

ledor473 on 24 May 2018

Re open question - yes, these should go into tracer tags imo.

yurishkuro on 24 May 2018

Any progress on this? Our use-case is that we have several clusters runnings the same applications.

We'd like to filter by cluster, but adding that information into each application config seems non-ideal and error-prone.

gouthamve on 25 Jul 2018

I quickly look into implementing this but I couldn't find any easy place to hook into.
Unless I've missed something (totally possible as I've never looked at Thrift), the Agent seems to only forward/proxy what it receives to the Collector (with a buffer).

ledor473 on 5 Sep 2018

the Agent seems to only forward/proxy what it receives to the Collector (with a buffer).

pretty much. But we can add an enrichment step that will add some tags to the Process, so that we wouldn't need to change the data model.

yurishkuro on 6 Sep 2018

ledor473 on 19 Sep 2018

👍2

Note: #1396 implements agent tags, but only for gRPC reporting path (not for TChannel, which we're looking to deprecate).

yurishkuro on 19 Mar 2019

🎉1

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Streaming ingestion support in the collector

yurishkuro · 4Comments

Specialize Issue template

vprithvi · 3Comments

Flaky test: `TestReload`

albertteoh · 3Comments

Multiarch docker images

rur0 · 4Comments

Query endpoint api/services returns 500 if elasticsearch storage is empty

pavolloffay · 3Comments