Envoy: Tracing headers not propagated when x-ot-span-context header is omitted

Created on 1 Aug 2017  路  9Comments  路  Source: envoyproxy/envoy

Within Zipkin, there isn't a concept of an 'x-ot-span-context' header. However, the initial version of the Zipkin support within Envoy requires this header to be present for Envoy to propagate the B3 headers downstream. The following is an example of a service making an inter-service call through Envoy.

Current behavior:

T 172.17.0.1:37180 -> 172.17.0.3:80 [AP]
GET / HTTP/1.1.
Host: obfuscated:8080.
x-b3-traceid: df2a47707d36857e.
x-b3-spanid: 1fcdac653464b039.
x-b3-parentspanid: de13127f091b8510.
x-b3-sampled: 1.
.

T 172.17.0.3:38092 -> 172.17.0.1:80 [AP]
GET / HTTP/1.1.
host: obfuscated:8080.
x-b3-traceid: 0000943861e352ca.
x-b3-spanid: 0000943861e352ca.
x-b3-parentspanid: de13127f091b8510.
x-b3-sampled: 1.
x-forwarded-proto: http.
x-request-id: f30826a5-e41a-99f4-a9ac-6daef36f4dd3.
x-ot-span-context: 0000943861e352ca;0000943861e352ca;0000000000000000;cs.

Notice in the current behavior that the x-b3-traceid does not match the existing trace identifier. It is dropped, and essentially a new trace is created. Also note, the parentspanid has an erroneous value. If Envoy decided to initiate a new and independent trace, the parent should be absent or null, and not use the parent of an unrelated trace.

Expected behavior:

T 172.17.0.3:38092 -> 172.17.0.1:80 [AP]
GET / HTTP/1.1.
host: obfuscated:8080.
x-b3-traceid: df2a47707d36857e.
x-b3-spanid: 0000943861e352ca.
x-b3-parentspanid: 1fcdac653464b039.
x-b3-sampled: 1.
x-forwarded-proto: http.
x-request-id: f30826a5-e41a-99f4-a9ac-6daef36f4dd3.

Notice in the expected behavior, x-b3-traceid remains the same as the initial request. parentspanid becomes the spanid from the upstream request, and spanid is a newly generated identifier.

Without Envoy, Trace T has one root span called X, Service A calls Service B. Service A annotates span X with CS CR. Optional: Service B annotates span X with SR SS.

When using Envoy, Trace T has one root span Y, and one child span Y. Service A calls Service B via Envoy. Service A annotates span X with CS CR. Envoy annotates span X with SR SS. Envoy injects span Y and annotates with CS CR. Optional: Service B annotates span Y with SR SS.

@mattklein123 @fabolive @rshriram @adriancole

aretracing bug help wanted

Most helpful comment

@mattklein123 I believe it is possible to rework the entire implementation so as to obviate the need for the x-ot-span-context header. I should have some cycles next week to make an assessment.

All 9 comments

@fabolive sorry I can't assign this to you. Can you and @rshriram take a look at this when you have some cycles and propose a solution? Thank you.

I was about to report the same thing.

To resolve the overwritten x-b3-traceid, would it make sense to change the code responsible for constructing the x-ot-span-context such that it first checks if the incoming request is internal, and if so, use the existing x-b3-traceid (if present)?

If that sounds good, I could put that on my queue (but I can't give an estimate of when I could deliver at the moment).

@adriancole if you have a bit of time do you think you could help advise us on what the appropriate behavior here should be? I think given that we can find someone to work on a fix.

@mattklein123 I believe it is possible to rework the entire implementation so as to obviate the need for the x-ot-span-context header. I should have some cycles next week to make an assessment.

The problem and behavior description of what should happen sounds legit to me

Could I suggest a slightly different fix - the x-ot-span-context is useful in cases where the application is responsible for propagating the context, as it is easier to just deal with a single header property.

However this causes problems if the application itself is instrumented, as the changes to the trace context are not propagated to the egress proxy.

If both the x-ot-span-context and B3 headers are passed in, as now - if the application is just passing the context through, then it could just use the x-ot-span-context - so the egress proxy would only receive it, and decode it to obtain the constituent parts.

However if a zipkin compatible tracer is being used by the application, it would take and build upon the B3 headers, and would eventually only pass the B3 headers to the egress proxy - thus the proxy knowing that the trace context must have been updated within the application.

So caters for simplified context passing, while still allowing the app to add its own internal tracing if required.

Is there any update on this issue? I seem to have run into this as well.

I'm marking this issue "help wanted." Someone just needs to step up and fix this. @objectiser any interest?

@mattklein123 will have a look early next week and let you know.

Was this page helpful?
0 / 5 - 0 ratings