Describe the bug
Re-opening spring-projects/spring-cloud-sleuth#1507 because it looks like there is a regression linked to this bug.
Reminder of the encountered issue:
Once using Sleuth and injecting Tracing headers in a WebTestClient I was able to retrieve them in Reactor's context later on like this :
// Test
webTestClient.get()
.uri("/api/tracing")
.header("X-B3-TraceId", "463ac35c9f6413ad48485a3953bb6124")
.header("X-B3-SpanId", "a2fb4a1d1a96d312")
// Controller
@GetMapping
fun spanId(): Mono<String> {
return Mono.subscriberContext()
.filter { it.hasKey(Span::class.java) }
.map { it[Span::class.java].context() }
.map { it.spanIdString() }
}
When comparing spanId and parentId present in the org.springframework.cloud.sleuth.TraceContext object, it appears that values are different than the spanId and parentId set in the headers.
Spring libraries versions used to reproduce this issue:
Sample
Sample available here
The branch 2021 contains latest changes to reproduce the issue with latest libraries (project forked from the initial reproducer)
The issue can be reproduced by executing the test com.spring.reproducer.spring.test.sleuth.reproducer.TracingResourceTest
Is this bug report only about tests and WebTestClient? When specifying a trace and span ID in the request header for Spring Cloud Gateway (based on WebFlux / Reactor) the values are ignored and new traces are generated by the application -- like in the test case linked above.
What are the headers that are being passed?
I've tried the B3 headers as well as the newer ones (_localhost:9095_ is a Spring Cloud Gateway instance):
curl -sv http://localhost:9095/sample-route -H "X-B3-TraceId: explicit_X-B3-TraceId" -H "X-B3-SpanId: explicit_X-B3-SpanId" -H "traceId: explicit_traceId" -H "spanId: explicit_spanId"
The logs (or MDC) does not show the "explicit" values but new traces like _fa2ed5b4165d32e9_.
Furthermore, also the downstream service that is called by the gateway receives the trace IDs generated by the gateway instead of the manually passed ones:
2021-02-15 10:37:02.214 DEBUG [my-downstream-service,,] 4256 --- [io-8089-exec-10] o.a.coyote.http11.Http11InputBuffer : Received [GET /foo HTTP/1.1
User-Agent: curl/7.68.0
traceId: explicit_traceId
spanId: explicit_spanId
X-Forwarded-Port: 9095
X-Forwarded-Host: 172.28.96.1:9095
X-B3-TraceId: fa2ed5b4165d32e9
X-B3-SpanId: e96af85cf01d6eda
X-B3-ParentSpanId: fa2ed5b4165d32e9
X-B3-Sampled: 0
...
]
Note that the traceId and spanId we see in the request headers of the downstream service is probably just a "pass-through" by the gateway. I'd expect that the X-B3-TraceId value equals explicit_traceId as well.
X-B3-TraceId and X-B3-SpanId must be valid longs. If they are not then AFAIR we're generating new ones
Gosh... indeed... yeah... it's working as expected with proper values... seems like I've fooled myself during testing ;-)
Thanks for the fast response, @marcingrzejszczak and sorry for hijacking this bug report!
No problem, glad that things are working fine :)
Hello,
For information, I face the issue in the reproducer using both the parentId and spanId (Reproduced with X-B3 field as well as the new spanId and parentId fields). When setting these fields explicitely in the header, I get different ones in the response body
Actually there are no new spanId and parentId - those are only set in MDC in a new format. For B3 headers you can either set the X-B3... or the b3 field that has both span, trace and optionally parent ids.
For information, I face the issue in the reproducer using both the parentId and spanId (Reproduced with X-B3 field as well as the new spanId and parentId fields). When setting these fields explicitely in the header, I get different ones in the response body
you're saying that you send (the ids are just a sample)
$ curl -sv http://localhost:9095/sample-route -H "X-B3-TraceId: 1" -H "X-B3-SpanId: 2
and in the gateway those headers are ignored and traceid and spanid are set to other values?
Yes, as you can see in the following image, I'm doing the curl command and the value set in the header is ignored:

I don't think 1 and 2 are valid values. Can you set propert values there? E.g.
$ curl -sv http://localhost:9095/sample-route -H "X-B3-TraceId: fa2ed5b4165d32e9" -H "X-B3-SpanId: fa2ed5b4165d32e9"
Interestingly it does not work with ID values smaller than 10. For me it is also not working with X-B3-TraceId: 1 (GW generates new values), but it does with e.g. X-B3-TraceId: 10.
Update: hehe, simultaneous post :)
I also have another value set :

I have no idea what is the thing that you're returning. Are values in the logs wrong? Is the trace data propagated properly? What do the Sleuth logs say?
Let me try to summarize what I'm doing when I execute my test. The first step is a call using the webTestClient where I set the headers:

At this step, the spandId set in the header is "a2fb4a1d1a96d312"
Then, when I execute this test in debug mode, I'm able to visualize the content of the TraceContext instance (import org.springframework.cloud.sleuth.TraceContext)
I can visualize the following content :

My understanding is that the value of the spanId is never propagated to the TraceContext.
If you need some sleuth log, I have to setup logging in the reproducer. I can do it in the meantime.
Wait a second - the trace context looks ok, the spanid is the child of the a2f... span. I don't understand what the problem is? We automatically created a child span.
Well, I'm calling only one service from my test and my understanding is that the spandId and parentId would be equal to the ones I'm setting in the header.
Am I misunderstanding how the traceContext should be instanciated?
It depends on what you're calling. What does happen properly is that on the client side the request headers were set properly and the trace context was successfully propagated. I have no idea what your controller does and how it looks like.
The controller does nothing more than what I shared in the reproducer : Controller link here
the steps are:
Most likely that's a component somewhere in the reactor instrumentation that is creating a child span. I don't see a bug here nor a regression. The context is properly propagated.
What confuses me is that it was the same issue in the following ticket: pring-projects/spring-cloud-sleuth#1507
In older versions of spring boot, the test in the reproducer actually passed (that's why I consider it a regression because the test passed until we move to Spring boot 2.4). The reproducer is a simple extract of an automated test of our project so we have detected that the behaviour around the propagation of the spanId/parentId seems different
Ok I'll triple check this when I have time. Thanks for the reproducer.
I'm sorry but I can't import the project to my IDE. Can you rewrite the sample to Java and Maven?
Hello @marcingrzejszczak, I'm going to take a look to convert it today in java/maven on a separate brach
@marcingrzejszczak You can find the converted reproducer at the following location : here
The test class to execute is TracingResourceTest.java
I know what the problem is. Brave has the notion of Propagation and Propagation.Factory. When I introduced the composite of propagation types I have not overriden the Propagation.Factory methods. That means that what you see that happens is that supportJoin is equal to false. The trace headers are properly parsed but since support join is disabled it creates a new child span.
As a workaround what you can do is create your own bean like this
@Bean
BaggagePropagation.FactoryBuilder myBaggagePropagationFactoryBuilder() {
return BaggagePropagation.newFactoryBuilder(B3Propagation.newFactoryBuilder().injectFormat(B3Propagation.Format.SINGLE_NO_PARENT).build());
}
and then you'll override the faulty composite one that I'll try to fix asap.
Most helpful comment
I know what the problem is. Brave has the notion of
PropagationandPropagation.Factory. When I introduced the composite of propagation types I have not overriden thePropagation.Factorymethods. That means that what you see that happens is thatsupportJoinis equal tofalse. The trace headers are properly parsed but since support join is disabled it creates a new child span.As a workaround what you can do is create your own bean like this
and then you'll override the faulty composite one that I'll try to fix asap.