Disclaimer: I am not sure whether this bug is better reported here or at the Spring Cloud Gateway project. Please advise if this is not the right place.
Describe the bug
We upgraded our Spring Boot app, which uses Spring Cloud Gateway, to Spring Boot 2.4.2 and Spring Cloud 2020.0.0. We are now experiencing the following problems with the traces / spans sent to the Zipkin endpoint of our Jaeger instance.
1) Each incoming HTTP request on the Spring Cloud Gateway produces three internal spans. One of these spans now gets a new trace ID and can therefore no longer be correlated in Zipkin / Jaeger.
2) The service name tag is now "default" and no longer equals the spring application name.
UPDATE: The second issue was caused by bootstrap.yml, where we defined the application name, no longer being loaded.
Sample
List of spans (before Spring Boot / Spring Cloud update)
[
{
"traceId": "91d56e081e36e8bc",
"parentId": "91d56e081e36e8bc",
"id": "ca04a2bd19e44922",
"kind": "CLIENT",
"name": "post",
"timestamp": 1612437023562700,
"duration": 543708,
"localEndpoint": {
"serviceName": "csm-cloud-api"
},
"tags": {
"http.method": "POST",
"http.path": "/v1/projects/milestones/search"
}
},
{
"traceId": "91d56e081e36e8bc",
"parentId": "91d56e081e36e8bc",
"id": "d104472ec8ae1c69",
"kind": "CLIENT",
"name": "post",
"timestamp": 1612437023481568,
"duration": 632290,
"localEndpoint": {
"serviceName": "csm-cloud-api"
},
"tags": {
"http.method": "POST",
"http.path": "/v1/projects/milestones/search"
}
},
{
"traceId": "91d56e081e36e8bc",
"id": "91d56e081e36e8bc",
"kind": "SERVER",
"name": "post",
"timestamp": 1612437023425027,
"duration": 707790,
"localEndpoint": {
"serviceName": "csm-cloud-api"
},
"tags": {
"http.method": "POST",
"http.path": "/v1/projects/milestones/search"
}
}
]
List of spans (after Spring Boot / Spring Cloud update)
[
{
"traceId": "f3670d13e7086802",
"parentId": "f3670d13e7086802",
"id": "a95630490131e5fe",
"kind": "CLIENT",
"name": "post",
"timestamp": 1612435894235902,
"duration": 647587,
"localEndpoint": {
"serviceName": "default",
"ipv4": "172.18.0.1"
},
"tags": {
"http.method": "POST",
"http.path": "/v1/projects/milestones/search"
}
},
{
"traceId": "070c3bf6f58a35ed", <---- New Trace ID although this span belongs to the same http request
"parentId": "1fbd94e08dbf784d",
"id": "9eed61d9bc0f6a4e",
"kind": "CLIENT",
"name": "post",
"timestamp": 1612435894105856,
"duration": 796715,
"localEndpoint": {
"serviceName": "default",
"ipv4": "172.18.0.1"
},
"tags": {
"http.method": "POST",
"http.path": "/v1/projects/milestones/search"
}
},
{
"traceId": "f3670d13e7086802",
"id": "f3670d13e7086802",
"kind": "SERVER",
"name": "post",
"timestamp": 1612435894050843,
"duration": 874228,
"localEndpoint": {
"serviceName": "default",
"ipv4": "172.18.0.1"
},
"tags": {
"http.method": "POST",
"http.path": "/v1/projects/milestones/search"
}
}
]
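For anyone comparing span exports like the two lists above, here is a minimal sketch of how the broken correlation could be detected automatically. It is not part of Sleuth, Zipkin, or Jaeger; the function names find_stray_spans and find_dangling_parents are hypothetical, and the input is a trimmed copy of the "after update" spans. It assumes that the most common traceId in the list is the one the request should have.

```python
from collections import Counter


def find_stray_spans(spans):
    """Return spans whose traceId differs from the most common traceId.

    Assumption: all spans in the list belong to one HTTP request, so the
    most common traceId is treated as the expected one.
    """
    if not spans:
        return []
    counts = Counter(span["traceId"] for span in spans)
    main_trace_id, _ = counts.most_common(1)[0]
    return [span for span in spans if span["traceId"] != main_trace_id]


def find_dangling_parents(spans):
    """Return spans whose parentId does not match the id of any listed span."""
    known_ids = {span["id"] for span in spans}
    return [
        span
        for span in spans
        if "parentId" in span and span["parentId"] not in known_ids
    ]


# Trimmed copy of the spans reported after the update (IDs taken from above).
after_update = [
    {"traceId": "f3670d13e7086802", "parentId": "f3670d13e7086802",
     "id": "a95630490131e5fe", "kind": "CLIENT"},
    {"traceId": "070c3bf6f58a35ed", "parentId": "1fbd94e08dbf784d",
     "id": "9eed61d9bc0f6a4e", "kind": "CLIENT"},
    {"traceId": "f3670d13e7086802",
     "id": "f3670d13e7086802", "kind": "SERVER"},
]

print([span["id"] for span in find_stray_spans(after_update)])
print([span["id"] for span in find_dangling_parents(after_update)])
```

Both checks flag the same span, 9eed61d9bc0f6a4e: it carries a different trace ID, and its parentId does not refer to any span in the export.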
Can you create a sample that replicates this?
@marcingrzejszczak What exactly do you have in mind? A repo with sample code? That would take a while, I guess.
Correct
Hi @marcingrzejszczak, you can find an example here: https://github.com/larsduelfer/spring-cloud-gateway-tracing-issue
It consists of a vanilla Spring Cloud Gateway routing to a simple hello controller of the backend service.
As mentioned in the updated initial post above, the naming issue has been solved in the meantime. That problem was caused by bootstrap.yml no longer being loaded on application start.
The issue with the span containing a different trace ID can be demonstrated with the example I provided.
One more thing: I never really understood why there are three spans for the routing in the API gateway. Two would be sufficient: one for the incoming request and one for the routed, outgoing request. I have not yet understood what the third span represents.
Is there any update on this?
Nope. I didn't have time to look into it.
From debugging, I saw that the TraceWebFilter creates the first span with a new trace ID (as expected). Next, the TraceRequestHttpHeadersFilter creates another span, again with a new trace ID. The span for the forwarded request to the downstream service (the one the request is routed to) contains the first trace ID again. It seems that there is a bug in the TraceRequestHttpHeadersFilter. In my opinion, it should not start a new trace.
Unfortunately, I do not yet understand the code well enough to provide a fix for this.
Hi @marcingrzejszczak, I am now using version 3.0.2, which should include the commit for this issue (is this right?). I still see the extra span that I reported here. Any ideas?
Maybe you have a version mismatch? You can double check with a breakpoint if the Gateway instrumentation hooks in (it shouldn't).
Hi @marcingrzejszczak, I got it working now. Apologies for the confusion; I did indeed have a misconfiguration on my side. Updating to Spring Cloud 2020.0.2 solved the issue. Thanks a lot!