Describe the bug
There is a bug in envoy that results in incorrect Prometheus statistics output. The bug is outlined here: https://github.com/envoyproxy/envoy/issues/10073
This has been fixed in a recent PR that was merged to envoy upstream master: https://github.com/envoyproxy/envoy/pull/10833
This bug causes our Datadog dashboards to display incorrect information, because Datadog expects the Prometheus format to be valid. Ambassador simply exposes the raw Envoy output, so fixing the underlying Envoy bug would also fix the issue in Ambassador.
To Reproduce
An example of the affected Envoy Prometheus output is available upstream: https://github.com/envoyproxy/envoy/pull/10833
For Ambassador: view the raw Prometheus /metrics output and note that lines for the same stat are not grouped together consistently.
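To make the check above less manual, here is a rough Python sketch that scans a saved copy of the /metrics output and reports metric names whose lines show up in more than one separate group. It is not part of Ambassador or Envoy; the function name `find_split_metric_families` and the metric names in the usage note are made up for illustration, and the parsing is deliberately simplified (it assumes one sample per line and only folds `_bucket`/`_sum`/`_count` suffixes for families declared as histogram or summary).

```python
import re

# Hypothetical helper, not an Ambassador or Envoy API.
def find_split_metric_families(text):
    """Return metric names whose lines appear in more than one separate group."""
    family_types = {}   # metric name -> type declared on a "# TYPE" line
    seen = set()        # families encountered so far
    split = []          # families seen again after a different family
    current = None      # family of the group currently being read

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split()
            # Only "# TYPE name type" and "# HELP name ..." carry a metric name.
            if len(parts) < 3 or parts[1] not in ("TYPE", "HELP"):
                continue
            name = parts[2]
            if parts[1] == "TYPE" and len(parts) >= 4:
                family_types[name] = parts[3]
        else:
            # The sample name ends at the first "{" or whitespace.
            name = re.split(r"[{\s]", line, maxsplit=1)[0]
            # Fold histogram/summary series back into their declared family.
            for suffix in ("_bucket", "_sum", "_count"):
                base = name[: -len(suffix)]
                if name.endswith(suffix) and family_types.get(base) in ("histogram", "summary"):
                    name = base
                    break
        if name != current:
            if name in seen and name not in split:
                split.append(name)
            seen.add(name)
            current = name
    return split
```

For example, if the text contains a `# TYPE example_rq counter` group, then a group for `example_cx`, then a second `# TYPE example_rq counter` group, the function returns `["example_rq"]`. An empty list means every family was contiguous in the input.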
Expected behavior
I would like the recent commits fixing this problem to be backported to the datawire/envoy branch so that future releases of Ambassador will no longer have this prometheus bug and we will see correct metrics in Datadog.
Versions (please complete the following information):
Quick update: the PR that introduces this fix in Envoy was merged on April 24, 2020. The latest release (1.14.1) came out on April 8, 2020, so it does not include the fix. It looks like they plan to ship it with 1.15.0.
This will be fixed when 1.7.0 ships (in the coming days), which includes an upgrade to Envoy 1.15.
This fix is in 1.7.0, which is now available.