Text copied from https://discuss.elastic.co/t/map-files-to-pipelines/55063
I'm playing around with the new Elastic 5 products and am interested in using the ingest node feature with Filebeat as the agent. I have a server that runs several applications with different log files/formats, and it seems that per Filebeat agent you can only define a single Elasticsearch output that references a single pipeline.
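To make that concrete (the paths, host, and pipeline name below are just placeholders, not my actual config), the setup would have to look something like this, with one pipeline handling every format:

filebeat.prospectors:
- paths: ["/var/log/app1/*.log"]
- paths: ["/var/log/app2/*.log"]

output.elasticsearch:
  hosts: ["localhost:9200"]
  # the single pipeline has to contain the grok patterns
  # for every application's log format
  pipeline: big-shared-pipeline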
It would be nice if it were possible to define a mapping of log files to output pipelines instead of having to maintain one big pipeline containing all of the grok patterns. I feel it could get messy trying to keep track of all the patterns and which application's logs they match.
Is there something obvious I'm missing, or will I have to maintain a large list of grok patterns in one pipeline?
Thanks
I second that, for exactly the reasons mentioned by @sirstevepal.
I may be misunderstanding this issue, but wouldn't it be possible to run several Filebeats (one per log file format), thereby making sure that each log file type is sent to the right pipeline?
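For example (the file names, paths, and pipeline names here are invented), each instance would get its own small config pointing at its own pipeline:

# filebeat-apache.yml (hypothetical)
filebeat.prospectors:
- paths: ["/var/log/apache2/*.log"]
output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: apache-pipeline

# filebeat-myapp.yml (hypothetical)
filebeat.prospectors:
- paths: ["/var/log/myapp/*.log"]
output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: myapp-pipeline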
@Shugyousha In my particular case the environment is resource-constrained, and ideally I would like to use one Filebeat process, pinned to a single specific CPU (via taskset), for ingesting all log files.
We recently introduced format string support for pipelines. I assume this should solve the issue? https://www.elastic.co/guide/en/beats/filebeat/5.0/elasticsearch-output.html#_pipeline For detailed docs about format strings, check out the example for the index format string. @urso Perhaps you can add some more details on this?
I found this option as well when searching for a possible solution to this issue, but I wasn't sure how it would actually look in practice. Having an example there would help a lot.
As someone who hasn't worked with Elasticsearch much yet, it's not clear to me which conditionals are meant in the description of the option, or how you would have to format the array to make those conditionals apply to the right pipelines.
If you could send me a link to the relevant configuration concept (i.e. the conditionals), I can try to come up with a usage example for this option to include in the documentation.
Pipelines can be set directly using format strings (but format strings can only access event fields; there is no processing available to derive, say, the source basename).
For an example with conditionals, see the indices setting. pipelines works the same; just replace indices with pipelines and index with pipeline. The conditionals are the same as those available for processors: https://www.elastic.co/guide/en/beats/filebeat/5.0/configuration-processors.html#filtering-condition
We've got the exact same functionality for indices and pipelines in a few places, but the docs are not fully up to date yet (with all the duplication introduced, the docs might need some more restructuring).
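A sketch of what that could look like (the pipeline names and contains conditions are invented for illustration, modeled on the documented indices example):

output.elasticsearch:
  hosts: ["localhost:9200"]
  pipelines:
    - pipeline: apache-pipeline
      when.contains:
        source: "apache"
    - pipeline: myapp-pipeline
      when.contains:
        source: "myapp"

Events that match none of the conditions fall back to the top-level pipeline setting, if one is configured.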
Alternatively, instead of relying on conditionals, one can define a prospector per source and use the fields setting to set a custom pipeline parameter per prospector, e.g.:
filebeat.prospectors:
- ...
  fields.pipeline: pipeline1
- ...
  fields.pipeline: pipeline2

output.elasticsearch:
  pipeline: '%{[fields.pipeline]}'
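Filled in with hypothetical paths and host (everything here except the fields/pipeline mechanics is just an example), a complete config could look like:

filebeat.prospectors:
- input_type: log
  paths: ["/var/log/app1/*.log"]
  fields.pipeline: pipeline1
- input_type: log
  paths: ["/var/log/app2/*.log"]
  fields.pipeline: pipeline2

output.elasticsearch:
  hosts: ["localhost:9200"]
  # resolved per event: each prospector's fields.pipeline value
  # selects the ingest pipeline for that prospector's events
  pipeline: '%{[fields.pipeline]}'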
@urso I just tested such a configuration and it works. Thanks a lot!
One thing I noticed when using the '%{[fields.pipeline]}' format string for the pipeline: when the fields.pipeline field is missing, Filebeat does not report any error and seems to upload and function just fine. However, I was not able to find the uploaded data in Elasticsearch (maybe I did not look hard enough).
Thanks, that makes it a lot clearer!
Is there a way for people outside of the project to update the documentation or send a PR for it?
@Shugyousha Sure. All the docs are in the project-specific "docs" directory. For pipelines, for example, that's here: https://github.com/elastic/beats/blob/98e1aef77be9a44e79e21563bddea0ccd2f94689/libbeat/docs/outputconfig.asciidoc
@max0x7ba If the fields can't be found, it falls back to the pipeline setting. If there is no pipeline, no pipeline is used and the event is just added to the index.
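One way to avoid events being silently indexed without a pipeline (a sketch; the paths and pipeline names are hypothetical) is to make sure every prospector sets fields.pipeline, including one covering any remaining logs:

filebeat.prospectors:
- paths: ["/var/log/app1/*.log"]
  fields.pipeline: pipeline1
# covers logs not matched by the prospectors above,
# so fields.pipeline is always defined
- paths: ["/var/log/*.log"]
  fields.pipeline: default-pipeline

output.elasticsearch:
  pipeline: '%{[fields.pipeline]}'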
@ruflin That explains my observations, thank you.
I opened a pull request for the documentation: https://github.com/elastic/beats/pull/3010
Feedback welcome!