Spring-cloud-sleuth: Error Reporting in Sleuth

Created on 14 Jun 2017  路  15Comments  路  Source: spring-cloud/spring-cloud-sleuth

Hello,

I currently try to get as much information reported as possible, when an exception is not caught in one of the microservices I am monitoring.
For test I set up a spring boot application that will throw an exception with a message. My Spans are sent to Kafka in JSON format. When looking at https://github.com/spring-cloud/spring-cloud-sleuth#visualizing-errors there seems to be a way to forward more information than just the status code I am currently reading from the Spans in Kafka.
I could not find any useful resource to help me to understand better how error/exception information can be handled in sleuth. Any hints would be greatly appreciated.

Best regards,
Lukas

question

Most helpful comment

Thank you so much for your help. Updating to Dalston.SR1 helped:

{
    "host": {
        "serviceName": "accounting-core-service",
        "address": "192.168.178.47",
        "port": 5001
    },
    "spans": [{
            "begin": 1497528329778,
            "end": 1497528329784,
            "name": "http:/accounting-core-service/deutsche-bahn/5/book",
            "traceId": -6109416087067669485,
            "parents": [-6109416087067669485],
            "spanId": 3778522118262908458,
            "remote": true,
            "exportable": true,
            "tags": {
                "mvc.controller.class": "BasicErrorController",
                "http.status_code": "500",
                "mvc.controller.method": "errorHtml",
                "error": "Request processing failed; nested exception is java.lang.Exception: This is the booking Exception!",
                "spring.instance_id": "Notebook:accounting-core-service:5001"
            },
            "logs": [{
                    "timestamp": 1497528329778,
                    "event": "sr"
                }, {
                    "timestamp": 1497528329784,
                    "event": "ss"
                }
            ],
            "durationMicros": 6273
        }
    ]
}
{
    "host": {
        "serviceName": "zuul-service",
        "address": "192.168.178.47",
        "port": 9000
    },
    "spans": [{
            "begin": 1497528329772,
            "end": 1497528329811,
            "name": "http:/accounting-core-service/deutsche-bahn/5/book",
            "traceId": -6109416087067669485,
            "parents": [-6109416087067669485],
            "spanId": 3778522118262908458,
            "exportable": true,
            "tags": {
                "http.status_code": "500",
                "lc": "zuul",
                "spring.instance_id": "Notebook:zuul-service:9000",
                "http.path": "/deutsche-bahn/5/book",
                "http.url": "/deutsche-bahn/5/book",
                "http.method": "GET"
            },
            "logs": [{
                    "timestamp": 1497528329772,
                    "event": "cs"
                }, {
                    "timestamp": 1497528329774,
                    "event": "cs"
                }, {
                    "timestamp": 1497528329811,
                    "event": "cr"
                }
            ],
            "durationMicros": 39186
        }
    ]
}

For anyone else who might read/use this solution: Do not forget to update the version of your spring-boot-starter-parent. That is what caused me the trouble with the application not starting anymore after updating to Dalston.SR1

Just one more thing (hopefully): Is there a convenient way to override the behavior how the error-tag is filled?

All 15 comments

It's enough for you to create a tag with key error and whatever value you want. If that's the case then Zipkin will put a red color to the rows as presented in the screenshots.

With the next Dalston release you'll also have the ErrorParser interface if you want to provide more customized error message - https://github.com/spring-cloud/spring-cloud-sleuth/pull/590.

So the feature shown at the position of the link is not implemented by default? I will handle the messages myself. So it does not have to match the zipkin-server. I was just assuming the information required for the described case will be written somewhere so the zipkin server can handle them - and I could use those information to do the different things I want to do.

It is implemented by default but you can customize what you want to be there in the error tag's value. So the answer to your question is that if there's an exception thrown that we're catching, then Sleuth will automatically add the error tag and value will be the exception message. If you want to customize that value you'll need to use the ErrorParser feature. If you want to provide a custom error tag to make Zipkin colourize your span red - just add the error tag with some arbitrary value you want.

Ok - when I call a method like:

@RequestMapping(
            value = {"/{service}/{route_id}/book", "/accounting-core-service/{service}/{route_id}/book"},
            method = RequestMethod.GET)
    @ResponseBody
    public ModelAndView book(@PathVariable(value="service") String serviceName,
                             @PathVariable(value = "route_id") int routeId,
                             HttpServletResponse response) throws Exception {

        throw new Exception("This is the booking Exception!");
        }

I received the following JSONS for two Spans-Objects written to Kafka:

{
    "host": {
        "serviceName": "zuul-service",
        "address": "192.168.178.47",
        "port": 9000
    },
    "spans": [{
            "begin": 1497442779186,
            "end": 1497442779924,
            "name": "http:/accounting-core-service/deutsche-bahn/5/book",
            "traceId": -6513692843231898233,
            "parents": [-6513692843231898233],
            "spanId": -7158068482129368830,
            "exportable": true,
            "tags": {
                "http.path": "/accounting-core-service/deutsche-bahn/5/book",
                "http.url": "/accounting-core-service/deutsche-bahn/5/book",
                "http.status_code": "500",
                "http.method": "GET",
                "lc": "zuul"
            },
            "logs": [{
                    "timestamp": 1497442779186,
                    "event": "cs"
                }, {
                    "timestamp": 1497442779924,
                    "event": "cr"
                }
            ],
            "durationMicros": 737509
        }
    ]
}   

{
        "host": {
            "serviceName": "accounting-core-service",
            "address": "192.168.178.47",
            "port": 5001
        },
        "spans": [{
                "begin": 1497442779620,
                "end": 1497442779652,
                "name": "book",
                "traceId": -6513692843231898233,
                "parents": [-7158068482129368830],
                "spanId": -3422535960945121632,
                "exportable": true,
                "tags": {
                    "mvc.controller.method": "book",
                    "mvc.controller.class": "AccountServiceController"
                },
                "logs": [],
                "durationMicros": 32174
            }, {
                "begin": 1497442779427,
                "end": 1497442779664,
                "name": "http:/accounting-core-service/deutsche-bahn/5/book",
                "traceId": -6513692843231898233,
                "parents": [-6513692843231898233],
                "spanId": -7158068482129368830,
                "remote": true,
                "exportable": true,
                "tags": {
                    "http.status_code": "500"
                },
                "logs": [{
                        "timestamp": 1497442779428,
                        "event": "sr"
                    }, {
                        "timestamp": 1497442779664,
                        "event": "ss"
                    }
                ],
                "durationMicros": 237122
            }
        ]
    }

Do I make wrong assumptions how exceptions are caught? Or what am I missing?

Which version of Sleuth are you using? Do you have in the logs on the controller side an exception thrown and in the log message you can see that there is no span / trace id set?

You could also run this with Sleuth logging at debug mode and send me the logs.

I am using:

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.cloud</groupId>
                <artifactId>spring-cloud-dependencies</artifactId>
                <version>Camden.RELEASE</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

That leads to spring-cloud-sleuth-stream 1.0.9.RELEASE and spring-cloud-stream-binder-kafka 1.1.0.RELEASE

And here are some lines of the log containing the thrown exception:
logs.txt

Can you upgrade to Dalston.SR1 and try again? Or at least to latest Camden?

I tried the version update to Dalston.SR1 however that results into TypeNotPresentExceptions that I will have to figure out first. I will update here as soon as I get it working.

Thank you so much for your help. Updating to Dalston.SR1 helped:

{
    "host": {
        "serviceName": "accounting-core-service",
        "address": "192.168.178.47",
        "port": 5001
    },
    "spans": [{
            "begin": 1497528329778,
            "end": 1497528329784,
            "name": "http:/accounting-core-service/deutsche-bahn/5/book",
            "traceId": -6109416087067669485,
            "parents": [-6109416087067669485],
            "spanId": 3778522118262908458,
            "remote": true,
            "exportable": true,
            "tags": {
                "mvc.controller.class": "BasicErrorController",
                "http.status_code": "500",
                "mvc.controller.method": "errorHtml",
                "error": "Request processing failed; nested exception is java.lang.Exception: This is the booking Exception!",
                "spring.instance_id": "Notebook:accounting-core-service:5001"
            },
            "logs": [{
                    "timestamp": 1497528329778,
                    "event": "sr"
                }, {
                    "timestamp": 1497528329784,
                    "event": "ss"
                }
            ],
            "durationMicros": 6273
        }
    ]
}
{
    "host": {
        "serviceName": "zuul-service",
        "address": "192.168.178.47",
        "port": 9000
    },
    "spans": [{
            "begin": 1497528329772,
            "end": 1497528329811,
            "name": "http:/accounting-core-service/deutsche-bahn/5/book",
            "traceId": -6109416087067669485,
            "parents": [-6109416087067669485],
            "spanId": 3778522118262908458,
            "exportable": true,
            "tags": {
                "http.status_code": "500",
                "lc": "zuul",
                "spring.instance_id": "Notebook:zuul-service:9000",
                "http.path": "/deutsche-bahn/5/book",
                "http.url": "/deutsche-bahn/5/book",
                "http.method": "GET"
            },
            "logs": [{
                    "timestamp": 1497528329772,
                    "event": "cs"
                }, {
                    "timestamp": 1497528329774,
                    "event": "cs"
                }, {
                    "timestamp": 1497528329811,
                    "event": "cr"
                }
            ],
            "durationMicros": 39186
        }
    ]
}

For anyone else who might read/use this solution: Do not forget to update the version of your spring-boot-starter-parent. That is what caused me the trouble with the application not starting anymore after updating to Dalston.SR1

Just one more thing (hopefully): Is there a convenient way to override the behavior how the error-tag is filled?

Like I mentioned previously

With the next Dalston release you'll also have the ErrorParser interface if you want to provide more customized error message - #590.

Perfect. Any schedule when that release will be available?

Nothing concrete yet.

I just discovered a case that seems to be only partly covered: Timeout Exceptions thrown by the zuul service while the main processing is done by a different service.

In this case I only get the following tags on the BasicErrorController span:
Map(mvc.controller.class -> BasicErrorController, mvc.controller.method -> errorHtml, spring.instance_id -> Notebook:zuul-service:9000, http.status_code -> 500)

The exception in zuul is the following: exception.txt

For some reason (I assume as the error is not thrown in the same service as the processing is done at that moment) the error tag is not set and filled.

Can you file a separate issue for this?

Was this page helpful?
0 / 5 - 0 ratings