There seems still be problem after the fix #764 .
flatmap works fine if there is not WebClient involved:
return Mono.just("").flatMap(s -> {
log.info("s1 {}", tracer.getCurrentSpan().toString()); // Not null and good
return Mono.just("");
});
However inside the flatMap of WebClient, the context is lost (not tracing going on) and a new traceId is generated if there are subsequent outgoing requests:
return webClient.get().uri(url).exchange()
.flatMap(r -> {
// here no traceId nor spanId
log.info("s1 {}", tracer.getCurrentSpan().toString()); // NullPointerException
return Mono.just("");
});
webClient is Autowired and the upstream can get the traceId in the first call.
Version:
spring-boot-starter-parent 2.0.0.M7
spring-cloud-sleuth 2.0.0.M5
Could you please check this after the Brave migration #711 ?
I think I'm experiencing the same/ a similar problem (and it also might be related to https://github.com/spring-cloud/spring-cloud-sleuth/issues/850).
I'm using Spring Boot with Webflux RC01 and Spring Cloud Sleuth M6.
We use a WebClient to send requests (with flatMap after the exchange). The traceId (and I think the span to) is there until the WebClient gets the response after the exchange, a thread change appears and the traceId is gone.
Log:
2018-02-17 09:29:27.992 TRACE [application-name,48221e9569d3b656,23eba2118dc28bb6,true] 13424 --- [ctor-http-nio-3] o.s.c.s.log.Slf4jCurrentTraceContext : With parent: 711504781224515631
2018-02-17 09:29:27.992 TRACE [application-name,48221e9569d3b656,23eba2118dc28bb6,true] 13424 --- [ctor-http-nio-3] o.s.c.s.log.Slf4jCurrentTraceContext : With parent: 711504781224515631
2018-02-17 09:29:28.054 DEBUG [application-name,48221e9569d3b656,23eba2118dc28bb6,true] 13424 --- [ctor-http-nio-3] c.d.t.tex.handlers.TokenExchangeHandler : Processing of request started
2018-02-17 09:29:28.124 DEBUG [application-name,48221e9569d3b656,23eba2118dc28bb6,true] 13424 --- [ctor-http-nio-3] c.d.t.t.modules.ClassWithWebClient : Sending request with WebClient
2018-02-17 09:29:28.125 DEBUG [application-name,48221e9569d3b656,23eba2118dc28bb6,true] 13424 --- [ctor-http-nio-3] .s.c.s.i.w.c.TraceExchangeFilterFunction : Creating a client span for the WebClient
2018-02-17 09:29:28.167 TRACE [application-name,48221e9569d3b656,a09a95f85b3fe3e5,true] 13424 --- [ctor-http-nio-3] o.s.c.s.log.Slf4jCurrentTraceContext : With parent: 2588340607130962870
2018-02-17 09:29:28.221 DEBUG [application-name,48221e9569d3b656,23eba2118dc28bb6,true] 13424 --- [ctor-http-nio-3] .s.c.s.i.w.c.TraceExchangeFilterFunction : Reactor Context got injected with the client span RealSpan(48221e9569d3b656/a09a95f85b3fe3e5)
2018-02-17 09:29:28.709 TRACE [application-name,48221e9569d3b656,a09a95f85b3fe3e5,true] 13424 --- [ctor-http-nio-5] o.s.c.s.log.Slf4jCurrentTraceContext : With parent: 2588340607130962870
2018-02-17 09:29:28.792 TRACE [application-name,48221e9569d3b656,a09a95f85b3fe3e5,true] 13424 --- [ctor-http-nio-5] o.s.c.s.log.Slf4jCurrentTraceContext : With parent: 2588340607130962870
2018-02-17 09:29:28.849 DEBUG [application-name,,,] 13424 --- [ctor-http-nio-5] c.d.t.t.modules.ClassWithWebClient : Received response from WebClient
I'm not sure how this all works, but what I'm seeing is that in Slf4jCurrentTraceContext directly after the thread change in line 69 the previousTraceId is loaded from the MDC:
final String previousTraceId = MDC.get("traceId");
Through the MDC these ids seem to be stored thread local, and MDC.get returns null. Later in line 117 these null values are set in the MDC and the traceId in the MDC is overrwritten (and therefore not available for logging):
replace("traceId", previousTraceId);
@sbueringer can you double check if the problem is gone after fixing https://github.com/spring-cloud/spring-cloud-sleuth/issues/850 ?
Sorry for late confirmation however the problem is still there on 2.0.0.M8 .
And being more confused, i found that the spanId is never changing...Not the same behavior in old versions.
log.info("Outside"); // [b2cda26676756307,b2cda26676756307] [XNIO-2 task-1]
return Mono.just(new ResponseEntity<>("GOOD", HttpStatus.OK)).flatMap(ss->{
log.info("Inside"); //[b2cda26676756307,b2cda26676756307] [XNIO-2 task-1]
return webClient.get().uri("https://www.google.com").exchange().flatMap(s1->{
log.info(s1.statusCode().toString()); //[b2cda26676756307,b2cda26676756307] [reactor-http-nio-4]
return s1.bodyToMono(String.class).flatMap(html ->{
log.info(String.valueOf(html.length())); // [,] [reactor-http-nio-4]
return webClient.get().uri("https://www.google.com").exchange().flatMap(s2->{
log.info(s2.statusCode().toString()); // [2ec715a23375dd4b,2ec715a23375dd4b] [reactor-http-nio-5]
return s2.bodyToMono(String.class).flatMap(html2 ->{
log.info(String.valueOf(html2.length())); // [,] [reactor-http-nio-5]
return Mono.just(ss);
});
});
});
});
});
I can't replicate it with snapshots (I modified your test a little bit)
2018-03-16 10:24:47.691 DEBUG [TraceWebFluxTests,,,] 57738 --- [o-auto-1-exec-7] o.s.c.s.instrument.web.TraceWebFilter : Received a request to uri [/withFlatMap2]
2018-03-16 10:24:47.691 DEBUG [TraceWebFluxTests,,,] 57738 --- [o-auto-1-exec-7] o.s.c.s.instrument.web.TraceWebFilter : Handled receive of span RealSpan(b7e97a826a4ef451/b7e97a826a4ef451)
2018-03-16 10:24:47.693 INFO [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [o-auto-1-exec-7] o.s.c.s.i.reactor.sample.FlatMapTests : withFlatMap2
2018-03-16 10:24:47.693 INFO [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [o-auto-1-exec-7] o.s.c.s.i.reactor.sample.FlatMapTests : Outside
2018-03-16 10:24:47.695 INFO [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [o-auto-1-exec-7] o.s.c.s.i.reactor.sample.FlatMapTests : Inside
2018-03-16 10:24:47.697 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [o-auto-1-exec-7] .s.c.s.i.w.c.TraceExchangeFilterFunction : Instrumenting WebClient call
2018-03-16 10:24:47.697 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [o-auto-1-exec-7] .s.c.s.i.w.c.TraceExchangeFilterFunction : Created a client span for the WebClient RealSpan(b7e97a826a4ef451/75481d02d86cfc51)
2018-03-16 10:24:47.700 DEBUG [TraceWebFluxTests,,,] 57738 --- [-auto-1-exec-10] o.s.c.s.instrument.web.TraceWebFilter : Received a request to uri [/foo]
2018-03-16 10:24:47.700 DEBUG [TraceWebFluxTests,,,] 57738 --- [-auto-1-exec-10] o.s.c.s.instrument.web.TraceWebFilter : Handled receive of span RealSpan(b7e97a826a4ef451/75481d02d86cfc51)
2018-03-16 10:24:47.701 INFO [TraceWebFluxTests,b7e97a826a4ef451,75481d02d86cfc51,true] 57738 --- [-auto-1-exec-10] o.s.c.s.i.reactor.sample.FlatMapTests : foo
2018-03-16 10:24:47.702 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,75481d02d86cfc51,true] 57738 --- [-auto-1-exec-10] o.s.c.s.instrument.web.TraceWebFilter : Found span in context RealSpan(b7e97a826a4ef451/75481d02d86cfc51)
2018-03-16 10:24:47.702 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,75481d02d86cfc51,true] 57738 --- [-auto-1-exec-10] o.s.c.s.instrument.web.TraceWebFilter : Handled send of RealSpan(b7e97a826a4ef451/75481d02d86cfc51)
2018-03-16 10:24:47.702 INFO [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [ctor-http-nio-3] o.s.c.s.i.reactor.sample.FlatMapTests : First flat map
2018-03-16 10:24:47.704 INFO [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [ctor-http-nio-3] o.s.c.s.i.reactor.sample.FlatMapTests : Second flat map
2018-03-16 10:24:47.704 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [ctor-http-nio-3] .s.c.s.i.w.c.TraceExchangeFilterFunction : Instrumenting WebClient call
2018-03-16 10:24:47.705 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [ctor-http-nio-3] .s.c.s.i.w.c.TraceExchangeFilterFunction : Created a client span for the WebClient RealSpan(b7e97a826a4ef451/801e7f6e64c0787f)
2018-03-16 10:24:47.707 DEBUG [TraceWebFluxTests,,,] 57738 --- [o-auto-1-exec-9] o.s.c.s.instrument.web.TraceWebFilter : Received a request to uri [/foo]
2018-03-16 10:24:47.707 DEBUG [TraceWebFluxTests,,,] 57738 --- [o-auto-1-exec-9] o.s.c.s.instrument.web.TraceWebFilter : Handled receive of span RealSpan(b7e97a826a4ef451/801e7f6e64c0787f)
2018-03-16 10:24:47.708 INFO [TraceWebFluxTests,b7e97a826a4ef451,801e7f6e64c0787f,true] 57738 --- [o-auto-1-exec-9] o.s.c.s.i.reactor.sample.FlatMapTests : foo
2018-03-16 10:24:47.709 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,801e7f6e64c0787f,true] 57738 --- [o-auto-1-exec-9] o.s.c.s.instrument.web.TraceWebFilter : Found span in context RealSpan(b7e97a826a4ef451/801e7f6e64c0787f)
2018-03-16 10:24:47.709 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,801e7f6e64c0787f,true] 57738 --- [o-auto-1-exec-9] o.s.c.s.instrument.web.TraceWebFilter : Handled send of RealSpan(b7e97a826a4ef451/801e7f6e64c0787f)
2018-03-16 10:24:47.709 INFO [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [ctor-http-nio-6] o.s.c.s.i.reactor.sample.FlatMapTests : Third flat map
2018-03-16 10:24:47.711 INFO [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [ctor-http-nio-6] o.s.c.s.i.reactor.sample.FlatMapTests : Fourth flat map
2018-03-16 10:24:47.712 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [ctor-http-nio-6] o.s.c.s.instrument.web.TraceWebFilter : Found span in context RealSpan(b7e97a826a4ef451/b7e97a826a4ef451)
2018-03-16 10:24:47.712 DEBUG [TraceWebFluxTests,b7e97a826a4ef451,b7e97a826a4ef451,true] 57738 --- [ctor-http-nio-6] o.s.c.s.instrument.web.TraceWebFilter : Handled send of RealSpan(b7e97a826a4ef451/b7e97a826a4ef451)
Can you confirm that it's working fine with snapshots?
As for having the same span. That's completely deliberate. We are passing the same span unless you create a new one.
ping @latifrons ?
Closing due to lack of feedback
From my point of view logging from flatMap() or map() seems to be working in M8. But after switching to M9 span are exchanged any more as it seems.
Can you provide a sample?
Hi
Sorry for the delay. Here is a sample: https://github.com/joke/sleuth
This might be related to https://github.com/spring-cloud/spring-cloud-sleuth/issues/949
Fixed via https://github.com/spring-cloud/spring-cloud-sleuth/commit/51abe7bba75a9dc9943d8dbbbb8987fcf68dd97b
@joke Can you double check that the snapshots work fine?
:+1: works perfectly. thank you.
Most helpful comment
:+1: works perfectly. thank you.