Beats: Implement default fallback option when using templates in autodiscover

Created on 16 Jan 2018  ·  22 Comments  ·  Source: elastic/beats

When defining templates in autodiscover, it would be nice to have a default fallback to use when none of them matches, something like this:

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - condition:
           contains:
             docker.container.name: "nginx"
         config:
           - module: nginx
             access:
               prospector:
                 type: docker
                 containers.stream: stdout
                 container.ids:
                   - "${data.docker.container.id}"
     # Fallback to docker prospector (without parsing) for unknown containers:
     default:
       - type: docker
         container.ids:
           - "${data.docker.container.id}"
:Processors Integrations Platforms Backlog containers enhancement libbeat

All 22 comments

👍 on this suggestion. Another option for the config would be the following:

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - default: true
         config:
           container.ids:
             - "${data.docker.container.id}"
       - condition:
           contains:
             docker.container.name: "nginx"
         config:
           - module: nginx
             access:
               prospector:
                 type: docker
                 containers.stream: stdout
                 container.ids:
                   - "${data.docker.container.id}"
  • I think there should be one default per type defined. This allows different defaults for each type, and if type: docker is defined multiple times, it would allow one default for each. I think this could allow interesting combinations. I wonder if the indentation in your example above should have been two spaces to the left, to be on the same level as providers?
  • The option above would allow one format for all config options instead of having it in two places. I'm hoping that simplifies the code.
  • In the above case, there is no "global" fallback if someone uses multiple providers. Do we need that?

Hi!

How is a condition.contains[] met today?

If it is met only when all of its sub-conditions are true, i.e. true && met(contains[0]) && met(contains[1]) && ... && met(contains[n-1]), I'd be happy with multiple levels of defaults to cover my needs. For example, I'd want to write:

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - # This specific type of containers/pods in this specific namespace has its own log format
         condition:
           contains:
             kubernetes.pod.name: "mixer"
             kubernetes.namespace.name: "istio-system"
         config:
           - module: istio-mixer
             log:
               prospector:
                 type: docker
                 containers.stream: stdout
                 container.ids:
                   - "${data.docker.container.id}"
       - # This is the default for the "istio-system" namespace.
         # "Almost" all the containers/pods in this specific namespace would have a uniform log format
         condition:
           contains:
             kubernetes.namespace: "istio-system"
         config:
           - module: istio
             log:
               prospector:
                 type: docker
                 containers.stream: stdout
                 container.ids:
                   - "${data.docker.container.id}"
       - # This is the default for our modern apps of `type: docker`. Assume it emits structured logs as the pod is annotated with its modernity.
         condition:
           anyof:
             kubernetes.pod.annotations:
               contains: "i-am-modern-ndjson-logging-app"
         config:
           - module: ndjson-logging-app
             log:
               prospector:
                 type: docker
                 containers.stream: stdout
                 container.ids:
                   - "${data.docker.container.id}"
       - # This is the global default per `type: docker`. We assume it emits non-structure logs
         config:
           - module: mylegacyapp
             log:
               prospector:
                 type: docker
                 containers.stream: stdout
                 container.ids:
                   - "${data.docker.container.id}"

@mumoshu I have been trying to find documentation for the conditionals. Do all of these work?

Hi @rossedman, the configuration seen in my above comment is just a suggestion! It isn't implemented as of today.

Does it look good to you?
I had been willing to contribute a PR once I got some approval and/or support for the suggestion, but was unable to do so due to... the silence 😉

Sorry for the late response @mumoshu, I missed your question while going over email :innocent:

The answer is yes: contains matches only when all the fields in the given map match :)

@rossedman you can find more info about conditions here: https://www.elastic.co/guide/en/beats/metricbeat/6.2/defining-processors.html#conditions
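
As a minimal sketch of the answer above (the container name and image values are hypothetical, and the fragment is untested), a single contains map with two entries behaves like an AND of both checks:

```yaml
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        # Both entries under `contains` must match for this template to apply.
        # "nginx" and "myregistry/nginx" are placeholder values.
        - condition:
            contains:
              docker.container.name: "nginx"
              docker.container.image: "myregistry/nginx"
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
```

A container whose name contains "nginx" but whose image does not contain "myregistry/nginx" would not match this template.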

Hi @exekias,
I'm trying to implement OR logic using multiple condition statements in filebeat.yml,
but it doesn't work.

       - condition:
           or:
              - contains:
                 docker.container.name: "image1"
              - contains:
                 docker.container.name: "image2"

Is there a way to achieve this without duplicating the single condition like this?

       - condition:
           contains:
             docker.container.name: "image1"


       - condition:
           contains:
             docker.container.name: "image2"

@mvasilenko I tested this and it worked for me: https://gist.github.com/exekias/e802ef376fdbbd4ba5872b57af4128bf

You may want to review your config: "image1" is repeated.

Good suggestion. Until something of this kind is implemented, as far as I understand, we have to resort to a workaround like this:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
            - condition:
                or:
                  - contains:
                      kubernetes.pod.name: internal-api
                  - contains:
                      kubernetes.pod.name: customer-api
                  - contains:
                      kubernetes.pod.name: exporter
              config:
                - type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  multiline.pattern: '^[[:space:]]+((at|\.{3})\b|^Caused by:)?'
                  multiline.negate: false
                  multiline.match: after
            - condition:
                and:
                  - not:
                      contains:
                        kubernetes.pod.name: internal-api
                  - not:
                      contains:
                        kubernetes.pod.name: customer-api
                  - not:
                      contains:
                        kubernetes.pod.name: exporter
              config:
                - type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"

https://github.com/elastic/beats/pull/9029 was just merged, which brings the ability to define a configuration without conditions. Conditions are matched in order, so if you put this one at the end it will act as the default. I think it solves this issue, so I'm proceeding to close it.

This is an example config that will be possible with this change:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
            - condition:
                or:
                  - contains:
                      kubernetes.pod.name: internal-api
                  - contains:
                      kubernetes.pod.name: customer-api
                  - contains:
                      kubernetes.pod.name: exporter
              config:
                - type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  multiline.pattern: '^[[:space:]]+((at|\.{3})\b|^Caused by:)?'
                  multiline.negate: false
                  multiline.match: after
            # Default:
            - config:
                - type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"

I tried that configuration with Docker containers only, and it fails: the Docker log files are parsed multiple times.

Consider the following config:

filebeat.autodiscover:
  providers:
    # Provider for our docker containers
    - type: docker
      templates:
        # Template for the spring boot json logging containers
        - condition:
            contains:
              docker.container.image: myuser/myimage
          config:
            - type: docker
              containers:
                ids:
                  - ${data.docker.container.id}
              encoding: utf-8
              json:
                keys_under_root: true
                add_error_key: true
                message_key: "message"
                overwrite_keys: true
                match: after
              fields:
                log.format.content: "json"
                log.format.layout: "spring-boot"
        - condition:
          config:
            - type: docker
              containers:
                ids:
                  - ${data.docker.container.id}
              encoding: utf-8
              fields:
                log.format.content: "plain"
                log.format.layout: "spring"

When I check this configuration with filebeat 6.6.2, it tells me that the config is ok. When I start this configuration, my expected behavior is:

  1. The log from the container myuser/myimage is using the template for json logging
  2. All other containers are using the default template

What happens is:

  1. The log from the container myuser/myimage is harvested using the given template
  2. The log of all containers including myuser/myimage is harvested using the default template.

Therefore the logs for the container using image myuser/myimage are harvested twice.

Is this the expected behavior? And if so, can the second log stream be suppressed in any way?
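
One possible way to suppress the second stream, adapted from the negated-condition workaround posted earlier in this thread, is to replace the unconditional template with an explicit "everything else" condition. This is a sketch only: it is untested against 6.6.2 and trims the json and fields settings from the config above for brevity.

```yaml
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        # JSON-logging containers
        - condition:
            contains:
              docker.container.image: myuser/myimage
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
              json.keys_under_root: true
              json.add_error_key: true
        # Explicit fallback: matches only containers NOT covered above,
        # instead of relying on a template without a condition.
        - condition:
            not:
              contains:
                docker.container.image: myuser/myimage
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
```

The drawback, as noted earlier in the thread, is that every positive condition has to be repeated in negated form in the fallback template.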

Same here. I don't think this is expected behavior.

I also tried to use the default condition with Filebeat 6.6.2.
My config is as follows:

autodiscover:
        providers:
          - type: kubernetes
            templates:
              - condition:
                  equals:
                    kubernetes.labels.type: java
                config:
                  - type: docker
                    containers.ids:
                      - "${data.kubernetes.container.id}"
                    multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2} ' 
                    multiline.negate: true 
                    multiline.match: after
                    ignore_older: 48h
              - condition:
                  contains:
                    kubernetes.labels.app: nginx
                config:
                  - module: nginx
                    access:
                      input:
                        type: docker
                        containers.stream: stdout
                        containers.ids:
                          - "${data.kubernetes.container.id}"
                        ignore_older: 48h
                    error:
                      input:
                        type: docker
                        containers.stream: stderr
                        containers.ids:
                          - "${data.kubernetes.container.id}"
                        ignore_older: 48h
              - config:
                  - type: docker
                    containers.ids:
                      - "${data.kubernetes.container.id}"
                    ignore_older: 48h

The result is that my logs are stored twice. For example, for java apps which have kubernetes.labels.type: java I have a multiline doc with parsed fields (loglevel, class, etc) AND a raw doc without these fields.

Simple example to reproduce: https://gist.github.com/olivierboudet/796a240577ea7f9fcb5a6f25a7114e6c
In the configuration, I added a condition to ignore all logs from the filebeat container.
If I uncomment the last part (i.e. the default config), all logs are processed, including those from the filebeat container.

Any news on this? Does the unconditional config work or not?

Sorry for the late response. We released the default_config setting for hints-based autodiscover; check how it works here: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover-hints.html#_kubernetes_2

Check the hints.default_config setting. I would say it fulfills what we are seeking with this issue.
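
For readers who have not used the setting, a minimal sketch of hints.default_config on the Kubernetes provider might look like the following. The log path is an assumption about the host layout, and the fragment only covers hints-based autodiscover, not the templates discussed above.

```yaml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      hints.enabled: true
      # Input launched for containers discovered via hints when no more
      # specific hint applies. The path below is an assumed host layout.
      hints.default_config:
        type: container
        paths:
          - /var/log/containers/*-${data.kubernetes.container.id}.log
```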

@exekias Using this currently:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      in_cluster: false
      host: ${HOSTNAME}
      kube_config: /home/svcaccount/.kube/config
      hints_enabled: true
      templates:
        - condition:
            contains:
              kubernetes.container.image: redis
          config:
            - module: redis
              log:
                enabled: true
                input:
                  type: container
                  paths:
                    - /var/lib/docker/containers/${data.kubernetes.container.id}/*.log
              slowlog:
                enabled: false
        - config:
            - type: container
              paths: ["/var/lib/docker/containers/${data.kubernetes.container.id}/*.log"]
              exclude_lines: ["^\\s+[\\-`('.|_]"]  # drop asciiart lines

Which seems to work without using hints.default_config. I did not immediately find any duplicates. As I'm not really using hints, should I still put:

        - config:
            - type: container
              paths: ["/var/lib/docker/containers/${data.kubernetes.container.id}/*.log"]
              exclude_lines: ["^\\s+[\\-`('.|_]"]  # drop asciiart lines

under hints.default_config: ?

That's interesting. Some users reported these settings were failing for them. Do you have any redis container running? We would need to check that it falls under the redis condition and the default settings are not launched for it

Sorry for the late response. We released the default_config setting for hints-based autodiscover; check how it works here: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover-hints.html#_kubernetes_2

Check the hints.default_config setting. I would say it fulfills what we are seeking with this issue.

Would you please elaborate? I couldn't get it to work :(

Using Filebeat v7.4 I'm getting duplicates with the config below. There is a message containing the decoded JSON fields and another that apparently used the default config and did not decode the JSON. It seems like the default config acts as a "finally" step and is applied whether or not any condition was true for a given container.

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            equals:
              container.image.name: "mycomponent"
          config:
            - type: container
              paths:
                - /usr/share/filebeat/dockerlogs/${data.docker.container.id}/*.log
              processors:
                - decode_json_fields:
                    fields: ["message"]
                    target: "json"
        - config:
            - type: container
              paths:
                - /usr/share/filebeat/dockerlogs/${data.docker.container.id}/*.log

Is there anyone still working on this? I'm running into the same issue.

edit: running 7.5.2

Ditto. I'm not familiar with the Beats code, but I was looking at the pull request that made conditions in autodiscover optional, https://github.com/elastic/beats/pull/9029/files, and the template.GetConfig method seems to be applying the null condition config regardless of whether another condition is met. I thought the null condition config should only be applied if no other condition were met.

Still exists with Filebeat 7.6.1
