In this issue I want to discuss whether there are still any real benefits to using Atoms as keys in LogEvent instead of Strings.
It seems to me that switching to BTreeMap (which compares the actual strings instead of comparing hashes) and the introduction of default log schemas (https://github.com/timberio/vector/pull/1769, which reduced the use of static atoms) greatly reduced the benefits of using Atoms instead of plain Strings. On the other hand, code that works with Strings directly is easier to maintain, and some invocations of clone across the codebase could potentially be avoided with them.
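To make the ergonomics point concrete, here is a minimal sketch of a field map keyed by plain Strings. The `Event` type and its methods are hypothetical stand-ins for illustration, not Vector's actual LogEvent API. Because `String` implements `Borrow<str>`, lookups take a `&str` directly, with no atom construction and no clone at the call site.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a log event's field map (not Vector's real type).
pub struct Event {
    fields: BTreeMap<String, String>,
}

impl Event {
    pub fn new() -> Self {
        Event { fields: BTreeMap::new() }
    }

    /// Inserts a field, returning the previous value if the key existed.
    /// The key is allocated once here, when it enters the map.
    pub fn insert(&mut self, key: &str, value: &str) -> Option<String> {
        self.fields.insert(key.to_owned(), value.to_owned())
    }

    /// Looks a field up by `&str`; BTreeMap orders and compares the string
    /// contents themselves, so no hashing or interning is involved.
    pub fn get(&self, key: &str) -> Option<&str> {
        self.fields.get(key).map(String::as_str)
    }

    /// Returns the keys in the map's sorted order.
    pub fn keys(&self) -> Vec<&str> {
        self.fields.keys().map(String::as_str).collect()
    }
}
```

Callers can pass string literals or slices straight to `get`, which is the kind of friction reduction described above.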
This is a great question! They definitely add a bit of friction to the development process. Their clone is very cheap, so I wouldn't worry about that too much.
I'd be curious to see the effects on throughput due to memory use (duplicate string keys on the heap) and CPU (interning and reusing atoms vs allocating and deallocating strings). With some numbers, we'd be able to weigh ergonomics in the code vs any performance differences.
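As a rough illustration of the trade-off being weighed, the sketch below shows what interning buys in principle: each distinct key is stored on the heap once, and handing out another copy is a reference-count bump rather than a new allocation. The `Interner` type is a hypothetical, simplified model built on `Arc<str>`; it is not the Atom implementation Vector uses, which may differ substantially.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical minimal interner, for illustration only.
#[derive(Default)]
pub struct Interner {
    seen: HashSet<Arc<str>>,
}

impl Interner {
    /// Returns a shared handle for `s`, allocating only the first time
    /// a given string is seen. Every call still pays for a hash lookup.
    pub fn intern(&mut self, s: &str) -> Arc<str> {
        if let Some(existing) = self.seen.get(s) {
            // Already interned: cloning the Arc only bumps a reference count.
            return Arc::clone(existing);
        }
        let atom: Arc<str> = Arc::from(s);
        self.seen.insert(Arc::clone(&atom));
        atom
    }

    /// Number of distinct strings stored.
    pub fn len(&self) -> usize {
        self.seen.len()
    }
}
```

With plain Strings, each clone of a key is a separate heap allocation, but there is no interning lookup on creation. Which cost dominates depends on the workload, which is why benchmark numbers are needed to settle it.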
@a-rodin Would using Strings ease integration with the Lua and JavaScript transforms? It seems like it would for WASM/FFI in general.
Some simple benchmarks show that we get about 6-7% more throughput by removing Atom in favor of std::string::String.
Better performance and better ergonomics make this a clear win, so we should do it.