Kong: RFC (pdk): standardized tracing api and core instrumentation

Created on 3 Aug 2020 · 5 Comments · Source: Kong/kong

Summary

Currently, Kong has no tracing instrumentation in its core code. kong-plugin-zipkin exists, and is useful for getting a general sense of the timing of the different Nginx phases, but I'm seeing some unexplained latency that the plugin is not capturing.

I believe it would be hugely beneficial if Kong core took the lead in standardizing a tracing API within the PDK, which could then be extended or implemented by plugins such as the existing Zipkin one or perhaps a future OpenTelemetry plugin.

With a standardized API, Kong core could be instrumented in a concrete and fine-grained way. The API could be written in a way that the default implementation in core is a no-op, and only when a tracing plugin is enabled would the PDK tracing functions actually do anything.
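
To make the no-op-by-default idea concrete, here is a minimal, language-neutral sketch in Python. It is not Kong's PDK (which is Lua) and not a proposed final API: `NoopTracer`, `RecordingTracer`, `set_tracer`, `span`, and the span names are all hypothetical names chosen for illustration. Core code calls `span(...)` unconditionally, and the calls only record anything once a tracing plugin swaps in a real tracer.

```python
import time
from contextlib import contextmanager


class NoopTracer:
    """Hypothetical default tracer: entering a span does nothing."""

    @contextmanager
    def span(self, name):
        yield


class RecordingTracer:
    """Hypothetical tracer a plugin could install: records (name, seconds)."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.spans.append((name, time.perf_counter() - start))


# Core starts with the no-op tracer, so instrumentation is inert by default.
_tracer = NoopTracer()


def set_tracer(tracer):
    """Called by a tracing plugin when it is enabled (assumed hook)."""
    global _tracer
    _tracer = tracer


def span(name):
    """Entry point that instrumented core code would call."""
    return _tracer.span(name)


def handle_request():
    """Stand-in for instrumented core code; the span names are made up."""
    with span("router.match"):
        pass
    with span("balancer.select"):
        pass
    return "ok"
```

With the default tracer, `handle_request()` runs without recording anything. After `set_tracer(RecordingTracer())`, the same call appends one entry per span to the tracer's `spans` list, without any change to the instrumented code.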

This is mostly just a brain-dump of my thoughts at the moment. I'm happy to discuss further if there is interest in the idea. Thanks!


All 5 comments

+1

There have been talks about doing the same with metrics in the PDK, where different implementations such as Prometheus, Datadog, and StatsD can pull metrics from the core.
Log serialization (f3653f8ff) recently made its way into core.

Is this something you would be willing to contribute?

+1 to enhanced tracing and latency explanations in core Kong. I have always thought a glaring missing feature is information on how long DB queries take for a given transaction. That would help explain whether the latency being seen comes from Lua code execution, Nginx, or just waiting for the database to respond to queries. (This is less relevant for DB-less setups, but a breakdown of Lua code versus Nginx execution time would still be ideal for those kinds of deployments.)
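
As a rough illustration of the kind of breakdown described above, the following Python sketch accumulates time per category for a single request. Everything here is hypothetical: `LatencyBreakdown`, the category names, and `fake_db_query` (which just sleeps to stand in for a database round trip) are assumptions for the example, not anything Kong provides.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class LatencyBreakdown:
    """Accumulates elapsed seconds per category for one request."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def timed(self, category):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[category] += time.perf_counter() - start


def fake_db_query():
    """Stand-in for a real database call; sleeps to simulate waiting."""
    time.sleep(0.02)
    return {"id": 1}


def handle_request(breakdown):
    # Time spent waiting on the (simulated) database.
    with breakdown.timed("db"):
        row = fake_db_query()
    # Time spent in application or plugin code.
    with breakdown.timed("plugin_code"):
        result = {"consumer": row["id"]}
    return result


if __name__ == "__main__":
    breakdown = LatencyBreakdown()
    handle_request(breakdown)
    for category, seconds in breakdown.totals.items():
        print(f"{category}: {seconds * 1000:.1f} ms")
```

Running it prints one line per category, which makes it visible whether the request time went to waiting on the database or to code execution.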

+1

+1
