Beats: [Discuss] Refactor How libbeat features are activated for individual beats.

Created on 14 May 2018 · 9 Comments · Source: elastic/beats

While working on a function agent, I found the need for an ephemeral beat: a beat that writes nothing to disk and performs no disk-related actions, such as creating new directories.

A custom beat developer should be able to tell libbeat which features to activate; we could provide a default set of capabilities.

This would mean we could break the core into these capabilities:

  • Keystore
  • Seccomp
  • Reporting
  • Persisted (or ephemeral data, write log to disk, create data folder)
  • Install Dashboards
  • Install Ingest Pipeline
  • Install ML Job
  • LogSystemInfo

By default, beat authors have access to all the capabilities.

Advantages

  • Tailored experience, use what you need
  • Reduce the complexity of cmd/instance/beat.go
  • Easier testing
  • Force isolation.
  • Possibility of better logger
  • Faster startup time

Possible interface

func init() {
  // List returns the currently active capabilities.
  _ = capabilities.List()
  if err := capabilities.Clear(); err != nil {
    // capabilities already activated
  }

  // Since capabilities might have dependencies, we let the capabilities package handle that.
  // The first version could use a hardcoded execution order.
  if err := capabilities.Enable(capabilities.Seccomp); err != nil {
    // ... panic: could not enable the capability,
    // e.g. it requires capability X or we are not on Linux.
  }
  if err := capabilities.Enable(capabilities.Ephemeral); err != nil {
    // ... handle the error as above
  }
}

This might get complicated if we merge multiple beats into one. We could solve this by adding the beat's canonical name to the Enable method and validating at startup that the capabilities are compatible.

Each capability would be responsible for validating its own configuration.

Integrations Inbox discuss libbeat refactoring

All 9 comments

We could also add a hook for config transformers, which take an existing configuration and try to set other config options; the code dealing with cloud_id could be the first one.

I believe the downside of this interface is that the capabilities would still be part of the binary; we might want to change that so they are excluded at compile time.

As @ruflin pointed out, we might want to have something similar to the outputs or processors.

What about adding a configuration file to the generators which specifies which parts of libbeat should be copied to the vendor folder of a new Beat? Also, we could still keep the command line flags and prompting for answers when we need them.

capabilities:
 - input/file
 - processors:
    - drop_event
 - output/kafka

This way everyone could generate a small Beat that satisfies their needs. However, I see my idea as more of a dream; I am not sure it's feasible.

@kvch <3 I think for simplicity it could be a generated Go file with the defaults that lets users opt in or opt out. Since this is more of an advanced use case, I would not add anything to the generator.

I like the idea from @kvch. Instead of having to modify the vendor directory, it could be an include.go file that is generated, similar to what we do for outputs or modules, based on a YAML file like the one @kvch proposed above. So libbeat itself would no longer have an include directory that loads, for example, all outputs; it would be up to each beat. It would require that all components listed above are built in a plugin-like way. This could also lead to smaller binaries since, for example, the kubernetes and docker processors could be excluded.

@kvch @ruflin I will work on a more formal plan for that. This issue came from my current need, but now I realize it's something we want elsewhere.

I made some progress on this. It looks like a nice way of doing it: it allows enough flexibility and at the same time lets us remove the init. The syntax looks like this.

module.Bundle(
    bootstrap.ModuleLogSystemInfo,
    elasticsearch.Module,
    logstash.Module,
    kafka.Module,
),

And the plugin is defined like this.

var Module = outputs.Module("logstash", create)

This builds on @urso's work on plugins.

Advantages:

  • All plugins use the same registry.
  • Only compile what you need.
  • Go plugins use the same syntax, no changes.

Bundle will merge the definitions, so we can have something like this.

var DefaultModules = module.Bundle(
    elasticsearch.Module,
    logstash.Module,
)

module.Bundle(
    DefaultModules,
    http.Module,
),
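The merging behaviour described above could be sketched roughly as follows. This is not the actual implementation: the `Module` interface, the `Plugin` struct, and the choice to drop duplicates by kind and name are all assumptions made for illustration.

```go
package main

import "fmt"

// Module is anything that can be bundled: a single plugin or another bundle.
// (Hypothetical interface, not the libbeat one.)
type Module interface {
	Plugins() []Plugin
}

// Plugin identifies one registered component.
type Plugin struct {
	Kind string // e.g. "output"
	Name string // e.g. "logstash"
}

// Plugins lets a single Plugin be used wherever a Module is expected.
func (p Plugin) Plugins() []Plugin { return []Plugin{p} }

// bundle is a flattened list of plugins.
type bundle []Plugin

func (b bundle) Plugins() []Plugin { return b }

// Bundle merges modules into one, flattening nested bundles and keeping
// the first occurrence of each kind/name pair (an assumption of this sketch).
func Bundle(mods ...Module) Module {
	seen := map[Plugin]bool{}
	var out bundle
	for _, m := range mods {
		for _, p := range m.Plugins() {
			if !seen[p] {
				seen[p] = true
				out = append(out, p)
			}
		}
	}
	return out
}

func main() {
	elasticsearch := Plugin{Kind: "output", Name: "elasticsearch"}
	logstash := Plugin{Kind: "output", Name: "logstash"}
	http := Plugin{Kind: "output", Name: "http"}

	defaults := Bundle(elasticsearch, logstash)
	all := Bundle(defaults, http)

	for _, p := range all.Plugins() {
		fmt.Printf("%s/%s\n", p.Kind, p.Name)
	}
}
```

Since a bundle is itself a Module, bundles nest freely, which is what lets a set of defaults be extended with one extra plugin.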

ping @urso
