vector 0.10.0 (g0f0311a x86_64-apple-darwin 2020-07-22)
[sources.app_log]
type = "http"
address = "0.0.0.0:9091"
encoding = "json"
[sinks.to_file]
type = "file"
inputs = ["app_log"]
path = "/tmp/vector.log"
encoding.codec = "ndjson"
https://gist.github.com/awangc/c09cd11f75a3ed17bcf201e2bad580e0
A warning/log line that test and test2.child fields have been overwritten (see the gist above)
The fields test and test2.child were silently overwritten
See the curl command from the gist above
After some digging around with @awangc, this behavior appears to come from two things combined: the http source automatically dedots the keys by calling log::insert rather than log::insert_flat, and BTreeMap::insert replaces existing keys. I wonder whether automatically dedotting the dotted JSON keys at the source is the right behavior. Maybe it's better to leave the keys as is, and have dedot be a separate, opt-in transform later in the pipeline?
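To make the overwrite concrete, here is a minimal, self-contained sketch of the two insertion styles. It is not Vector's code: the `Value` enum, the `text` helper, and the bodies of `insert_flat` and `insert_path` are hypothetical stand-ins, and the choice to replace a scalar with a map during path insertion is an assumption made to mirror the behavior reported above. The keys are illustrative and not copied from the gist.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for an event value; not Vector's actual type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Text(String),
    Map(BTreeMap<String, Value>),
}

// Small helper to build a text value (illustrative only).
fn text(s: &str) -> Value {
    Value::Text(s.to_string())
}

// Flat insertion: the key is stored verbatim, dots included.
fn insert_flat(fields: &mut BTreeMap<String, Value>, key: &str, value: Value) {
    fields.insert(key.to_string(), value);
}

// Path insertion: each dot-separated segment descends one level.
fn insert_path(fields: &mut BTreeMap<String, Value>, key: &str, value: Value) {
    match key.split_once('.') {
        None => {
            // BTreeMap::insert replaces whatever was stored under this key.
            fields.insert(key.to_string(), value);
        }
        Some((head, rest)) => {
            let slot = fields
                .entry(head.to_string())
                .or_insert_with(|| Value::Map(BTreeMap::new()));
            // Assumption: a scalar already stored under `head` is replaced
            // by an empty map so the rest of the path can be inserted.
            if !matches!(slot, Value::Map(_)) {
                *slot = Value::Map(BTreeMap::new());
            }
            if let Value::Map(children) = slot {
                insert_path(children, rest, value);
            }
        }
    }
}
```

In this sketch, inserting `test` and then `test.child` through `insert_path` leaves a single top-level `test` key holding a map, so the original scalar is lost without any warning. The same two calls through `insert_flat` keep both keys side by side, which is the behavior argued for above.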
Thanks @drunkirishcoder. We're working through data model changes that should simplify and clarify this work. @Hoverbear is leading the charge there. You can see the first iteration of this work in her data model RFC.
I think in this case the proper behavior is pretty clear and we should not be restructuring data that we're explicitly passed as JSON. Hopefully the fix is as simple as replacing insert with insert_flat and the RFC will help prevent issues like this in the future.
I can try it out if we agree that is the correct behavior.
@drunkirishcoder That'd be so cool! We'd love a PR and you're free to ping us here or on https://discord.gg/phwDQpd if you encounter any problems. :)
Finally able to get this submitted. @Hoverbear