Vector attempts to provide good defaults, and with this in mind, we could offer a multiline handling logic out of the box.
By multiline handling logic I mean unifying things like stack traces that span multiple lines into a single event.
I do not mean merging partial messages, i.e. messages that the underlying source implementation splits because they exceed the length it supports for a single log line. That is a different concern entirely, and it's far simpler to handle and less controversial.
My proposal is to introduce this at the topology level, similar to how we have an option to add buffers anywhere.
The benefits of this are:
The actual logic will be based on the line_agg that we implemented for the file source. The newly added code will operate on the events emitted from the wrapped source, and merge them. Sort of like a built-in transform.
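To make the "built-in transform" idea concrete, here is a minimal sketch of the merging step. It is not the actual line_agg code: the name `merge_multiline` and the `is_continuation` predicate are hypothetical, events are simplified to plain strings, and there is no timeout handling, which a real streaming implementation would need in order to flush the last pending event.

```rust
/// Hypothetical helper for illustration only; not Vector's line_agg API.
///
/// Walks the lines emitted by a wrapped source and appends every line for
/// which `is_continuation` returns true to the event that precedes it.
pub fn merge_multiline<I, F>(lines: I, is_continuation: F) -> Vec<String>
where
    I: IntoIterator<Item = String>,
    F: Fn(&str) -> bool,
{
    let mut merged: Vec<String> = Vec::new();
    for line in lines {
        if is_continuation(&line) {
            // Only merge when there is a previous event to attach to;
            // a leading continuation line falls through and starts an event.
            if let Some(current) = merged.last_mut() {
                current.push('\n');
                current.push_str(&line);
                continue;
            }
        }
        // Any other line starts a new event.
        merged.push(line);
    }
    merged
}
```

With a predicate such as `|l: &str| l.starts_with(char::is_whitespace)`, the indented frames of a stack trace would be folded into the line that precedes them, while unindented lines each start a new event.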
After this change, we'll be able to remove the line_agg from the file source as it won't be needed anymore.
@MOZGIII as we discussed, I'd like to complete this feature. In general, the team agrees that doing this within each source, like we do in the file source, is best for flexibility, performance, and user experience. We'd like to carry over the same multiline option that exists in the file source to all sources.
You mentioned the need to standardize how sources work with raw data so that we'd have a hook to perform these types of operations before an event is created. If you feel this is necessary, could you start with a simple RFC for this change? It does not need to be long. We need to get team buy-in before we make a change like that.
Just dropping a note that rsyslog has a couple of options for handling this, startmsg-regex, endmsg-regex and readMode that could be useful examples for this.
Closing this in favor of individual issues like https://github.com/timberio/vector/issues/3307.