Beats: Meta ticket: Filebeat modules

Created on 9 Dec 2016 · 15 Comments · Source: elastic/beats

This is the meta ticket for the Filebeat modules implementation.

TODOs and progress:

  • [x] #3158 Add a sample module (NGINX)
  • [x] #3158 Prototype module loading
  • [x] #3195 Add support for multiple paths on the same OS in the Nginx module
  • [x] #3171 Add sample module for Mysql
  • [x] #3191 Add sample module for Syslog
  • [x] #3214 Add system tests for the modules
  • [x] #3221 Move Kibana dashboards at the module level
  • [x] #3248 Add a module generator
  • [x] #3256 Apache2 module
  • [x] #3333 Phase 1 of the go implementation: create prospector from the modules
  • [x] #3394 Load the pipeline automatically
  • [x] #3433 Make the pipeline configurable from the prospector config
  • [x] #3472 Pass the pipeline ID from modules automatically
  • [x] #3436 Include the module files in the filebeat packages
  • [x] #3506 Replace the import_dashboard program with -setup
  • [x] #3516 Add the Beat version in the pipeline ID
  • [x] #3522 Only insert the pipeline if it's not already loaded
  • [x] #3524 Test the windows versions of the Filebeat modules
  • [x] #3540 Replace fields.source_type with fileset.module and fileset.name
  • [x] #3592 Docs: overview & tutorial
  • [x] #3598 Docs: create & gather per module docs and configuration samples
  • [x] #3616 Docs: Add a guide for creating modules
  • Docs: Add Logstash equivalent configurations. Tracked in: https://github.com/elastic/logstash/issues/6542

Overview

Filebeat modules are "packages" of the configurations and logic required to ship and analyze log files from common services. A typical module (e.g. nginx) is composed of several filesets, one for each type of log (e.g. access and error for nginx). A typical fileset contains the following:

  • filebeat prospector configuration
  • Elasticsearch Ingest Node pipeline definition
  • Fields definitions and docs
  • Kibana dashboards (at the module level)
  • Test log files
  • A manifest.yml file with overwritable variables and logic to select the right files

At the moment, Filebeat modules are strictly configuration files and templates, with no actual Go code. This makes it easy to create new modules. Eventually we might have some of the modules include Go code/plugins for more specific needs.

Filebeat has code to evaluate the variables from the manifest.yml file, interpret the templates, load the Ingest Node pipeline into Elasticsearch, and the Kibana dashboards into Kibana.
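To make the variable evaluation step more concrete, here is a minimal Python sketch. Filebeat itself is written in Go, so this is not its implementation. The sketch assumes a hypothetical manifest layout in which each variable has a `default` value and optional `os.<name>` entries. The function name `resolve_vars`, the key names, and the sample paths are illustrative assumptions only.

```python
# Illustrative sketch only: the manifest layout, key names, and paths below
# are assumptions, not the actual Filebeat manifest.yml schema.

def resolve_vars(manifest_vars, os_name, overrides=None):
    """Pick a value per variable: user override > OS-specific value > default."""
    overrides = overrides or {}
    resolved = {}
    for var in manifest_vars:
        name = var["name"]
        os_key = f"os.{os_name}"
        if name in overrides:
            resolved[name] = overrides[name]
        elif os_key in var:
            resolved[name] = var[os_key]
        else:
            resolved[name] = var["default"]
    return resolved


# Hypothetical variables for an nginx access fileset.
MANIFEST_VARS = [
    {
        "name": "paths",
        "default": ["/var/log/nginx/access.log*"],
        "os.darwin": ["/usr/local/var/log/nginx/access.log*"],
    },
]

if __name__ == "__main__":
    print(resolve_vars(MANIFEST_VARS, "linux"))
    print(resolve_vars(MANIFEST_VARS, "darwin"))
    print(resolve_vars(MANIFEST_VARS, "linux",
                       {"paths": ["/opt/apache2/logs/access.log*"]}))
```

The resolved values would then be substituted into the prospector and pipeline templates. That step is left out here because its details depend on the template format.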

User interaction / tutorial

The easiest way to ship & parse the Nginx logs would be to start Filebeat like this:

filebeat -e -modules=nginx -setup

(The -e flag sends the log output to stderr instead of syslog or log files.)

The -setup flag instructs Filebeat to load the Kibana dashboards on startup. After that is done and you can see data in Kibana, you can restart Filebeat without the -setup flag:

filebeat -e -modules=nginx

You can also start multiple modules at once:

filebeat -e -modules=nginx,mysql,system

If you prefer the configuration file, you can add the following to it, which is the equivalent of the above:

modules:
- name: nginx
- name: mysql
- name: syslog

Then start Filebeat simply with filebeat -e.

Variable overrides

Each fileset has a set of "variables" defined in the manifest.yml file, which provide a first level of module configuration. Most modules, for instance, allow setting custom paths for the log files. To adjust where the access logs are, you can type:

filebeat -e -modules nginx -M "nginx.access.var.paths=[/opt/apache2/logs/access.log*]"

Or via the configuration file:

modules:
- name: nginx
  access:
    var.paths: ["/opt/apache2/logs/access.log*"]

Advanced settings

Behind the scenes, each module starts a Filebeat prospector. For advanced users, it's possible to add or overwrite any of the prospector settings. For example, enabling close_eof can be done like this:

modules:
- name: nginx
  access:
    prospector.close_eof: true

Or like this:

filebeat -e -modules=nginx -M "nginx.access.prospector.close_eof=true"

From the CLI, it's possible to change variables or settings for multiple modules/filesets at once. For example, the following works and will enable close_eof for all the filesets in the nginx module:

filebeat -e -modules=nginx -M "nginx.*.prospector.close_eof=true"

The following also works and will enable close_eof for all prospectors created by modules:

filebeat -e -modules=nginx,mysql -M "*.*.prospector.close_eof=true"
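
The wildcard behaviour of -M can be pictured with a short Python sketch. This is not Filebeat's Go implementation. It assumes that each override has the form module.fileset.setting=value and that * matches any module or fileset name. The helper names and the list of filesets are hypothetical, and values are kept as raw strings rather than parsed.

```python
from fnmatch import fnmatchcase


def parse_override(arg):
    """Split 'module.fileset.setting=value' into its four parts."""
    key, _, value = arg.partition("=")
    module, fileset, setting = key.split(".", 2)
    return module, fileset, setting, value


def apply_overrides(filesets, args):
    """Return {(module, fileset): {setting: value}} for every known fileset."""
    result = {fs: {} for fs in filesets}
    for arg in args:
        mod_pat, fs_pat, setting, value = parse_override(arg)
        for module, fileset in filesets:
            # '*' in either position matches any module or fileset name.
            if fnmatchcase(module, mod_pat) and fnmatchcase(fileset, fs_pat):
                result[(module, fileset)][setting] = value
    return result


# Hypothetical set of enabled filesets, for illustration only.
FILESETS = [("nginx", "access"), ("nginx", "error"), ("mysql", "slowlog")]

if __name__ == "__main__":
    print(apply_overrides(FILESETS, ["nginx.*.prospector.close_eof=true"]))
    print(apply_overrides(FILESETS, ["*.*.prospector.close_eof=true"]))
```

With the first pattern only the two nginx filesets receive the setting. With the second, every fileset does.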

All 15 comments

Nice work on parsing the Mysql slow query log! I thought I would share an alternative approach to ingesting the slow query log in case it's helpful - https://gist.github.com/mjpowersjr/740a9583e9ec8b49e0a3

@mjpowersjr thanks for the hint, we could add the CSV format as an option to the module. That's actually a good example for a feature I was planning.

Are you going to expose this like community beats?

It would be really cool to have a cookiecutter to create filebeat modules that could be installed like elasticsearch plugins?

@blacktop we do plan to support modules that live outside the beats repository. Not sure if it's going to be in the first version, though.

Thanks for the overview update @tsg!

Does it make sense for Filebeat to actually run/collect when you use setup mode? Maybe it would make more sense for it to just exit and then you run without the setup flag. Does running in setup multiple times break anything?

I'm also curious about what happens when you start without setup or with a mixed state of setups. Does it know what's been set up, and can it error, or does it do something else? Like this:

> filebeat -e -modules nginx -setup
[stop/kill]
> filebeat -e -modules nginx,mysql

I'm ok if this is too edge-case for now, just curious what will happen.

@brandonmensing good questions! I haven't made up my mind about this yet. I'm thinking of two options:

1. -setup only loads the dashboards and does nothing else (i.e. the strict equivalent of today's import_dashboards). The pipeline loading, since it's a fairly quick and low-risk operation, can be done every time we connect to Elasticsearch, similarly to how we do it for the template. The drawback is that if you run two Beats with different versions of the same pipeline (e.g. because the Beats themselves are on different versions), they will continuously overwrite each other. Eventually we'll solve this via index and pipeline versioning.

2. -setup loads both the dashboards and the pipeline by default, and it's the operator's responsibility to call it at the right time. If they don't run with the correct -setup args, indexing will fail (that's better than not loading the template, when indexing succeeds but in the wrong format). My idea is that people will run with -setup on one of their servers, and without -setup on the others. On upgrades, they should first upgrade the -setup server and then the others.

Awesome, I like #2 better because it feels more likely that the user will figure out to do the right thing (we give them a warning for having not done the setup).

@brandonmensing I went for option 2 in #3394, because it was simpler to implement and you suggested it, but to be honest I think eventually we'll want to load the pipelines automatically to avoid common user errors.

The implementation of option 1 was more difficult because we don't have a way for the modules/prospector to register a callback in the ES output module (which we'd need to execute the load at connection time).

Filebeat modules are awesome!
They will make it possible to ingest new logs in seconds, with collecting, parsing, storing, and visualizing all packaged, up and running!

In which Filebeat version do you plan to include this magic new feature?

@fbaligand Thank you, equally excited here :) We're targeting 5.3, which should be released around the beginning of March. By then it should have all the basic functionality, and hopefully a couple more modules as well.

Great news!
I can't wait to play with it! :)

@fbaligand There are nightly builds, just in case ;-) https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/

Interesting! I didn't know about that!
So I won't have to wait until March to play with it ;)

This modules idea sounds very good. I can't wait to use them. Kudos.

First phase, which we'll include as beta in 5.3, should be pretty much done.
