Any chance of implementing multiline for more sources? I'm specifically thinking of journald, but there might be a way to apply it more generically to all inputs (or a class of inputs).
Thanks for reaching out!
Support for automatic partial message merging for journald, when used with docker, is planned; ref https://github.com/timberio/vector/issues/1719. This is a little different from adding support for a generic multiline parsing configuration.
I'll be working this week on adding the same multiline parsing capability that the file source has to the journald source.
In general, we should probably just add a transform for line aggregation, like the one in the file source.
I've switched focus to k8s integration, and I won't be implementing this in the near future. Just letting you know to be transparent.
So multiline is only available for the file source? Is this because no other source suffers from this? I'm facing multiline log parsing issues with the docker source.
@pySilver could you expand on the issues you're facing? The docker source should "just work" with multi-line splitting. This is assuming you're dealing with the max-length splitting that Docker does itself. If you're looking for something more advanced/custom, I'd recommend reading this guide. Otherwise, we would love to learn more about your use case so we can discuss how to properly solve it. Multi-line merging is something we want Vector to be good at.
@binarylogic hum, maybe I'm doing something wrong?
Here is my sample docker logs output for app container:
[2020-06-19 07:33:20,860: ERROR/ForkPoolWorker-1] {'categories': ['This field cannot be blank.']}
Traceback (most recent call last):
File "/app/abc/merchants/models.py", line 764, in sink_products
product = Product.validate_data(product_data)
File "/app/abc/catalog/models.py", line 1374, in validate_data
product.full_clean(
File "/opt/venv/lib/python3.8/site-packages/django/db/models/base.py", line 1222, in full_clean
raise ValidationError(errors)
django.core.exceptions.ValidationError: {'categories': ['This field cannot be blank.']}
[2020-06-19 07:33:27,753: ERROR/ForkPoolWorker-1] {'categories': ['This field cannot be blank.']}
Traceback (most recent call last):
File "/app/abc/merchants/models.py", line 764, in sink_products
product = Product.validate_data(product_data)
File "/app/abc/catalog/models.py", line 1374, in validate_data
product.full_clean(
File "/opt/venv/lib/python3.8/site-packages/django/db/models/base.py", line 1222, in full_clean
raise ValidationError(errors)
django.core.exceptions.ValidationError: {'categories': ['This field cannot be blank.']}
[2020-06-19 07:33:38,871: INFO/MainProcess] Received task: abc.merchants.tasks.sink_to_raw_products[e60da0e3-6260-40b8-a0dc-c6647bb336e2]
I'm simply sinking this into Loki, where each line becomes a separate log entry for some reason. I will test it one more time and get back here.
Thanks for the data. Let us try to recreate. If this is coming in through the docker source, it should only be 3 events. That's what you're expecting, correct?
@binarylogic exactly, this is why I initially tried to apply the multiline option, but it turned out there is no such option in the docker input plugin.
The docker source only merges partial messages out of the box (the ones that are split by docker itself). In this case, the input to docker is already split across multiple lines. The docker source therefore counters the effects of docker's internal message splitting (the effects introduced by passing the logs through docker), but preserves the input as it was: split into multiple lines.
The proper solution here would be to extract the line_agg from the file source into a transform, and it's what was planned - but it slipped through somehow.
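To illustrate the kind of line aggregation being discussed, here is a minimal Python sketch. It is not Vector's line_agg implementation; the names `START_PATTERN` and `aggregate_lines` are hypothetical, and it assumes that every new event starts with a bracketed timestamp like the one in the sample logs above, with all other lines being continuations of the previous event.

```python
import re
from typing import Iterable, Iterator, List

# Assumption: a new event starts with a bracketed timestamp such as
# "[2020-06-19 07:33:20,860: ERROR/ForkPoolWorker-1]".
START_PATTERN = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}: ")


def aggregate_lines(
    lines: Iterable[str], start: "re.Pattern" = START_PATTERN
) -> Iterator[str]:
    """Merge continuation lines into the event that precedes them."""
    buffer: List[str] = []
    for line in lines:
        # A line matching the start pattern closes the event collected so far.
        if start.match(line) and buffer:
            yield "\n".join(buffer)
            buffer = []
        buffer.append(line)
    # Flush whatever is left once the input ends.
    if buffer:
        yield "\n".join(buffer)
```

Fed the sample output above line by line, this would yield three events: two errors with their tracebacks attached, and the final INFO line. A streaming implementation would also need a timeout to flush the last buffered event, since it cannot know whether more continuation lines are coming.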
@MOZGIII so, if I get you right, it would be possible to solve this via a transform some time soon? Any other workarounds until then? I was wondering whether using a docker log driver like syslog or journald would solve the multiline issue.
btw, this is kinda strange that journald storage is not working in docker.
@pySilver it's more likely that we'll add the multiline option across all relevant sources, including docker and journald.
Great to hear that. Until then I'll probably use a dirty hack: redirecting this app's logs into a file sink and reading them back with another file source.
@karlseguin thanks again for reporting this. I'm closing this in favor of #2892 since we're beginning work there.