It looks like there is a possible memory leak in tracing-subscriber 0.1.6, which we currently use: https://github.com/tokio-rs/tracing/issues/515. We should upgrade to 0.2.0-alpha.5, which should just work out of the box.
That said, we should still investigate how these changes affected Vector. In general, I would like to see us run a memory profiling test that uses a TCP input with a large number of concurrent TCP streams that are recycled. This would create many new spans and could trigger some of this behavior. We should also run these tests at the highest log verbosity to push the tracing system to its limits.
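To make the "many recycled TCP streams" idea concrete, here is a minimal load-generator sketch using only the Rust standard library. It is not part of Vector or its test harness: the names `recycle_connections` and `spawn_drain_server` are hypothetical, and the drain server is only a stand-in for a real Vector instance with a TCP input. The assumption is that each new connection causes the receiver to create a new span.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Hypothetical helper: opens `connections_per_worker` short-lived TCP
/// connections from each of `workers` threads, writing one line per
/// connection before closing it. Returns how many connections were opened
/// and written to successfully.
pub fn recycle_connections(
    addr: SocketAddr,
    workers: usize,
    connections_per_worker: usize,
) -> usize {
    let handles: Vec<thread::JoinHandle<usize>> = (0..workers)
        .map(|worker| {
            thread::spawn(move || {
                let mut completed = 0usize;
                for i in 0..connections_per_worker {
                    // A fresh stream each iteration, so the receiver sees a
                    // brand-new connection every time.
                    if let Ok(mut stream) = TcpStream::connect(addr) {
                        let line = format!("worker={} conn={}\n", worker, i);
                        if stream.write_all(line.as_bytes()).is_ok() {
                            completed += 1;
                        }
                        // `stream` is dropped here, which closes the connection.
                    }
                }
                completed
            })
        })
        .collect();

    // Sum the per-worker counts; a panicked worker contributes zero.
    handles.into_iter().map(|h| h.join().unwrap_or(0)).sum()
}

/// Hypothetical stand-in for the system under test: accepts `expected`
/// connections on an ephemeral local port, drains each one to EOF, and
/// returns the total number of bytes received.
pub fn spawn_drain_server(expected: usize) -> (SocketAddr, thread::JoinHandle<usize>) {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind local port");
    let addr = listener.local_addr().expect("read local address");
    let handle = thread::spawn(move || {
        let mut total = 0usize;
        for stream in listener.incoming().take(expected) {
            if let Ok(mut stream) = stream {
                let mut buf = Vec::new();
                if let Ok(n) = stream.read_to_end(&mut buf) {
                    total += n;
                }
            }
        }
        total
    });
    (addr, handle)
}
```

In a real run, `addr` would point at the Vector TCP input instead of the drain server, Vector would run at its most verbose log level, and memory would be watched externally while the worker and connection counts are increased.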
I did some preliminary testing this morning but uncovered nothing.
References: https://github.com/tokio-rs/tracing/commit/3c35048ba7931804fb3d68a1226d6c01ffe1fb31 and https://github.com/tokio-rs/tracing/issues/515
I'm curious how this could have been avoided. In general, I'm not a fan of relying on bleeding-edge libraries like this.
With regard to testing, this sounds like something we can add to our test harness.
@binarylogic I'm not sure; I haven't noticed this specific behavior in any of my testing. I think the best way to start finding issues like this is to run multiple long-running instances of Vector that do things like recycle connections, and observe them over time.
I don't think we're hitting this specific issue anyway, because we don't create many deeply nested spans. As for relying on bleeding-edge libraries: Rust's logging/observability library space is still lacking in general. So yes, we may run into issues, but there isn't much choice here.
Sure, if we're going to put in the effort to test for this we should do it in a way that prevents regression. The test harness allows for long-running tests.
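One way a long-running test could guard against regressions is to sample memory periodically and fail if it keeps growing after warm-up. The sketch below is only an illustration of that check: `looks_like_leak`, its parameters, and the idea of comparing the two halves of the steady-state samples are assumptions, not something the test harness is known to provide.

```rust
/// Hypothetical regression check: given resident-memory samples (in KiB)
/// taken at regular intervals, skip the first `warmup` samples and compare
/// the mean of the first half of the remaining samples with the mean of the
/// second half. Returns true when the second half is larger by more than
/// `max_growth_ratio` (for example, 0.10 for 10%).
pub fn looks_like_leak(samples_kib: &[u64], warmup: usize, max_growth_ratio: f64) -> bool {
    // Need at least two steady-state samples to compare anything.
    let steady = match samples_kib.get(warmup..) {
        Some(s) if s.len() >= 2 => s,
        _ => return false,
    };

    let mean = |xs: &[u64]| xs.iter().sum::<u64>() as f64 / xs.len() as f64;

    let half = steady.len() / 2;
    let first = mean(&steady[..half]);
    let second = mean(&steady[half..]);

    first > 0.0 && second / first > 1.0 + max_growth_ratio
}
```

A test could collect the samples while a connection-recycling workload runs and assert that `looks_like_leak` returns false. The threshold and warm-up length would need tuning against real runs, since normal allocator behavior can also cause some growth.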