Beats: Kubernetes auto-discover does not play well with container annotations

Created on 22 Dec 2018 · 26 Comments · Source: elastic/beats

Configure filebeat with Kubernetes autodiscover with hints enabled e.g.:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true

Deploy nginx (or another container) into Kubernetes that writes its logs to standard output using the "access" log format.

On this container, add the annotation:

co.elastic.logs/fileset.stdout: access

Filebeat will now be unable to read logs from these containers, failing with errors like:

2018-12-04T14:26:29.853Z    ERROR   log/harvester.go:281    Read line error: invalid CRI log format; File: /var/lib/docker/containers/4da60db6721a545e448eb42732a57372d239b3882397a472e394504f9a448537/4da60db6721a545e448eb42732a57372d239b3882397a472e394504f9a448537-json.log

and

2018-12-04T02:32:18.636Z    ERROR   log/harvester.go:281    Read line error: parsing CRI timestamp: parsing time "3.chunk.js" as "2006-01-02T15:04:05Z07:00": cannot parse "unk.js" as "2006"; File: /var/lib/docker/containers/2a7d435274655d581cbdba63c8477042a30583c7cb84c9b2ef5a4c5af9bd2da7/2a7d435274655d581cbdba63c8477042a30583c7cb84c9b2ef5a4c5af9bd2da7-json.log

I believe the problem is that the annotation causes Filebeat to attempt to parse the entire log line as an access log, instead of just the relevant part of the Docker JSON envelope actually being read, but this is just a guess based on the observed behavior.
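
To illustrate that guess with a minimal, hypothetical Go sketch (not Filebeat's actual code path): a Docker json-file entry wraps the container's output in a JSON envelope, and only the inner "log" field is in the access format, so parsing the whole raw line as an access log (or as CRI) cannot succeed:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // dockerLogLine mirrors the envelope written by Docker's json-file
    // logging driver for every line a container prints.
    type dockerLogLine struct {
        Log    string `json:"log"`
        Stream string `json:"stream"`
        Time   string `json:"time"`
    }

    func main() {
        // One raw line from /var/lib/docker/containers/<id>/<id>-json.log
        // (hypothetical sample content).
        raw := `{"log":"10.0.0.1 - - [22/Dec/2018:10:00:00 +0000] \"GET / HTTP/1.1\" 200 612\n","stream":"stdout","time":"2018-12-22T10:00:00.000000000Z"}`

        var entry dockerLogLine
        if err := json.Unmarshal([]byte(raw), &entry); err != nil {
            fmt.Println("not Docker JSON:", err)
            return
        }

        // Only entry.Log is in the nginx access format; handing the whole
        // raw envelope to the access-log parser (or to the CRI parser) fails.
        fmt.Printf("stream=%s access line: %s", entry.Stream, entry.Log)
    }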

Labels: Integrations, bug, containers

Most helpful comment

Here is the code line producing the bug: https://github.com/elastic/beats/blob/master/libbeat/reader/readjson/docker_json.go#L200

When the reader skips a log line, the skipped bytes aren't added to the offset.

All 26 comments

Same issue here with filebeat 6.4.x (x > 1):

Read line error: parsing CRI timestamp: parsing time ""[2018-11-28" as "2006-01-02T15:04:05Z07:00": cannot parse ""[2018-11-28" as "2006"

Filebeat is running on a docker host, without kubernetes, but with filebeat.autodiscover enabled:

filebeat.yml

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - condition.or:
           - contains.docker.container.name: "nginx"
         config:
           - module: nginx
             access:
               prospector:
                 type: docker
                 containers.stream: stdout
                 containers.ids:
                   - "${data.docker.container.id}"
                 exclude_lines: ['exclude', 'lines' ]
              error:
               prospector:
                 type: docker
                 containers.stream: stderr
                 containers.ids:
                   - "${data.docker.container.id}"

Solved temporarily by downgrading to 6.4.0, removing /var/lib/filebeat/registry, and restarting filebeat.

I will have a look into this; currently trying to reproduce it.

I have the exact same error with multiple filebeat versions: 6.4.0, 6.4.2, 6.5.4. I tried each version after deleting the registry.

I tried with docker autodiscover and kubernetes autodiscover, with and without cri.flags=false and combine_partial=false. The issue is always there.

Is there a way to just disable CRI?

Also having this issue, with both 6.5.1 and 6.5.4.

I also tried with 7.0.0-alpha2 and can reproduce the issue. FYI, I do not enable hints.
This is my configuration:

filebeat:
      config:
        modules:
          - module: nginx
            access:
              enabled: true
            error:
              enabled: true
      autodiscover:
        providers:
          - type: kubernetes
            templates:
              - condition:
                  equals:
                    kubernetes.labels.type: "java"
                config:
                  - type: docker
                    containers.ids:
                      - "${data.kubernetes.container.id}"
                    multiline.pattern: '^\d{4}-\d{2}-\d{2} ' 
                    multiline.negate: true 
                    multiline.match: after
              - condition:
                  equals:
                    kubernetes.container.name: "nginx-ingress-controller"
                config:
                  - module: nginx
                    access:
                      input:
                        type: docker
                        containers.stream: stdout
                        containers.ids:
                          - "${data.kubernetes.container.id}"
                        processors: 
                          - add_kubernetes_metadata: 
                              in_cluster: true 
                    error:
                      input:
                        type: docker
                        containers.stream: stderr
                        containers.ids:
                          - "${data.kubernetes.container.id}"
                        processors: 
                          - add_kubernetes_metadata: 
                              in_cluster: true 
              - config:
                  - type: docker
                    containers.ids:
                      - "${data.kubernetes.container.id}"
                    processors: 
                      - add_kubernetes_metadata: 
                          in_cluster: true 
    setup:
      template:
        name: "filebeat"
        pattern: "filebeat-*"

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']  
      index: "%{[kubernetes.namespace]:nonamespace}-filebeat-%{+yyyy.MM.dd}"
      pipelines:
        - pipeline: java-logs-pipeline
          when.equals:
            kubernetes.labels.type: java
        - pipeline: mongodb-logs-pipeline
          when.equals:
            kubernetes.labels.app: mongo-pod

I am wondering whether the CRI parsing error only occurs on nginx-ingress-controller logs. Each time I checked which container was responsible for this error, it was an nginx ingress controller.

I removed the whole condition for nginx-ingress-controller and it looks better; I no longer see the error after restarting filebeat. I will check this for a few hours and let you know.

I removed this part (running 7.0.0-alpha2 at the moment):

              - condition:
                  equals:
                    kubernetes.container.name: "nginx-ingress-controller"
                config:
                  - module: nginx
                    access:
                      input:
                        type: docker
                        containers.stream: stdout
                        containers.ids:
                          - "${data.kubernetes.container.id}"
                        processors: 
                          - add_kubernetes_metadata: 
                              in_cluster: true 
                    error:
                      input:
                        type: docker
                        containers.stream: stderr
                        containers.ids:
                          - "${data.kubernetes.container.id}"
                        processors: 
                          - add_kubernetes_metadata: 
                              in_cluster: true 

24 hours without seeing the issue; I think this confirms that the problem occurs when using the nginx module.

The same issue. My config:
# docker stdout
- type: docker
  containers:
    path: "/var/lib/docker/containers"
    stream: "stdout"
    ids: "*"
  ignore_older: 2h
I have many "Read line error: invalid CRI log format" errors. Offsets in the registry file are corrupted, pointing into the middle of the JSON. That's why filebeat thinks the log isn't Docker JSON but containerd CRI (the first character isn't "{").

The issue doesn't reproduce with a single file, only with many, especially when restarting filebeat.
With a clean registry, it runs without errors for a couple of hours.
Tried 6.4.2, 6.5.1, 6.5.4.
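
To make the misdetection described above concrete, here is a minimal, hypothetical Go sketch (heavily simplified; not the actual libbeat reader). When the stored offset points into the middle of a JSON line, the next read no longer starts with "{", so the data is treated as CRI and the parser tries to interpret arbitrary text as a timestamp:

    package main

    import (
        "fmt"
        "strings"
        "time"
    )

    // parseLine mimics, in a very simplified way, an auto-detecting reader:
    // lines starting with '{' are treated as Docker JSON, anything else as
    // CRI, whose first whitespace-separated token must be an RFC3339 timestamp.
    func parseLine(line string) error {
        if strings.HasPrefix(line, "{") {
            return nil // would be handled by the Docker JSON path
        }
        ts := strings.SplitN(line, " ", 2)[0]
        if _, err := time.Parse(time.RFC3339, ts); err != nil {
            return fmt.Errorf("parsing CRI timestamp: %w", err)
        }
        return nil
    }

    func main() {
        full := `{"log":"GET /3.chunk.js 200\n","stream":"stdout","time":"2018-12-04T02:32:18Z"}`

        // Correct offset: the line starts with '{' and is treated as Docker JSON.
        fmt.Println(parseLine(full))

        // Corrupted offset pointing into the middle of the same line: the
        // remainder no longer starts with '{', so it is misdetected as CRI
        // and timestamp parsing fails, much like the errors reported above.
        fmt.Println(parseLine(full[20:]))
    }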

We have discovered the source of this issue. Filebeat has different offsets for each stream.
Let's say we have two lines:

{"log":"1111111111111\n","stream":"stdout","time":"2019-01-18T10:21:21.826206703Z"}
{"log":"2\n","stream":"stderr","time":"2019-01-18T10:21:21.826260281Z"}

Also suppose we have a config with stream: "stderr".
After filebeat processes the data, the offset in the registry will be 72 (the first line is skipped), but the right value is 155.
When filebeat rereads the registry, the offset will point into the middle of the first line.

To avoid the errors, don't use stream filtering; stream: all works fine :)

Here is the code line producing the bug: https://github.com/elastic/beats/blob/master/libbeat/reader/readjson/docker_json.go#L200

When the reader skips a log line, the skipped bytes aren't added to the offset.
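
As a minimal, hypothetical illustration of that line (not the real reader code), the sketch below shows how dropping the bytes of skipped lines makes the reported offset fall behind, and how counting them yields the expected value from the example above (72 vs. 155):

    package main

    import "fmt"

    // message stands in for one decoded Docker JSON log line; Bytes is the
    // number of bytes the line occupies in the file on disk.
    type message struct {
        Stream string
        Bytes  int
    }

    // nextMatching returns how many bytes the caller should advance its
    // registry offset by when reading up to the next line on the wanted
    // stream. With countSkipped=false it mimics the bug: bytes of
    // filtered-out lines are dropped, so the offset falls behind the file.
    func nextMatching(lines []message, stream string, countSkipped bool) int {
        skipped := 0
        for _, m := range lines {
            if m.Stream != stream {
                if countSkipped {
                    skipped += m.Bytes
                }
                continue
            }
            return m.Bytes + skipped
        }
        return skipped
    }

    func main() {
        // Byte counts roughly as in the two-line example above.
        lines := []message{
            {Stream: "stdout", Bytes: 83},
            {Stream: "stderr", Bytes: 72},
        }

        // Buggy behaviour: only 72 bytes are reported, so after a restart
        // the offset lands in the middle of the first (skipped) line.
        fmt.Println("buggy offset advance:", nextMatching(lines, "stderr", false))

        // Fixed behaviour: 83+72=155 bytes, i.e. the next line boundary.
        fmt.Println("fixed offset advance:", nextMatching(lines, "stderr", true))
    }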

Thank you everyone! I'll work on a PR fixing this

PR is up, I will be doing some testing: https://github.com/elastic/beats/pull/10211

The fix has been merged to master. I've built a snapshot docker image of Filebeat 6.5 with this patch: exekias/filebeat:6.5-snapshot, just in case anyone is willing to try it and provide feedback. Once the fix is confirmed, we will get it backported.

This fix solved the issue for me. I've updated filebeat and restarted the nginx-controller pods, whose logs are filtered by stream in my config.

Thanks!

I tested and confirm that using the image exekias/filebeat:6.5-snapshot (https://hub.docker.com/r/exekias/filebeat/tags) in the Filebeat DaemonSet definition solved the issue.

  • minikube version: v0.33.1

Kubernetes deployment definition using image: nginx:1.14-alpine

annotations:
  co.elastic.logs/module: nginx
  co.elastic.logs/fileset.stdout: access
  co.elastic.logs/fileset.stderr: error

Thanks! @exekias @baldomeromero

@exekias @baldomeromero I have attached my filebeat config, which was taken from https://github.com/elastic/beats/blob/6.5/deploy/kubernetes/filebeat-kubernetes.yaml with the ES endpoint and credentials added, filebeat.config.inputs disabled, autodiscover enabled, and annotations added.
I was hoping the NGINX logs would get parsed, but they don't seem to be. Can you take a look or post what yours looks like?

Image: docker.elastic.co/beats/filebeat:6.5.1
autodiscover:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true

annotations:

spec:
  template:
    metadata:
      labels:
        app: filebeat
      annotations:
        co.elastic.logs/module: nginx
        co.elastic.logs/fileset.stdout: access
        co.elastic.logs/fileset.stderr: error

Complete filebeat-kubernetes.yaml

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    app: filebeat
data:
  filebeat.yml: |-
    filebeat.config:
      #inputs:
        # Mounted `filebeat-inputs` configmap:
        # path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        # reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false

    # To enable hints based autodiscover, remove `filebeat.config.inputs` configuration and uncomment this:
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true

    processors:
      - add_cloud_metadata:

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      protocol: ${ELASTICSEARCH_PROTOCOL}
      #index: ${ELASTICSEARCH_INDEX}
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
    setup.template:
      enabled: true
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-inputs
  namespace: kube-system
  labels:
    app: filebeat
data:
  kubernetes.yml: |-
    - type: docker
      containers.ids:
      - "*"
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    app: filebeat
spec:
  template:
    metadata:
      labels:
        app: filebeat
      annotations:
        co.elastic.logs/module: nginx
        co.elastic.logs/fileset.stdout: access
        co.elastic.logs/fileset.stderr: error
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:6.5.1
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: es.mycompany.com
        - name: ELASTICSEARCH_PORT
          value: "9243"
        - name: ELASTICSEARCH_USERNAME
          value: mycompany
        - name: ELASTICSEARCH_PASSWORD
          value: mycompany
        - name: ELASTICSEARCH_PROTOCOL
          value: https
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: inputs
          mountPath: /usr/share/filebeat/inputs.d
          readOnly: true
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: inputs
        configMap:
          defaultMode: 0600
          name: filebeat-inputs
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    app: filebeat
---

@sujituk on a first look your configuration seems good. Do you see any of the errors mentioned in this issue in your filebeat logs? If not, I don't think this is related.

Could you please open a topic in https://discuss.elastic.co for further investigation and confirming if this is a problem with the configuration or an actual issue in filebeat?

Thanks!

Thanks @jsoriano for taking a look. The filebeat logs don't show anything out of the ordinary. Created a topic.

It's my first day taking the Elastic stack for a spin, and I keep running into the CRI line error. I'm not using kubernetes, just a docker-compose file:

version: '3.3'
services:

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.6.0
    container_name: elasticsearch
    environment:
      - discovery.type=single-node
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - esnet
    labels:
      - "co.elastic.logs/disable=true"

  kibana:
    image: docker.elastic.co/kibana/kibana:6.6.0
    environment:
      SERVER_NAME: kibana
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
    ports:
      - 5601:5601
    depends_on:
      - elasticsearch
    networks:
      - esnet
    labels:
      - "co.elastic.logs/disable=true"

  filebeat:
    image: docker.elastic.co/beats/filebeat:6.6.0
    command: --strict.perms=false -e
    environment:
      - "setup.kibana.host=kibana:5601"
      - "output.elasticsearch.hosts=elasticsearch:9200"
    volumes:
      - ./filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    labels:
      - "co.elastic.logs/disable=true"
    depends_on:
      - elasticsearch
      - kibana
    networks:
      - esnet

  nginx:
    image: nginx
    ports:
      - 8080:80
    labels:
      - "co.elastic.logs/module=nginx"
      - "co.elastic.logs/fileset.stdout=access"
      - "co.elastic.logs/fileset.stderr=error"
    networks:
      - esnet

volumes:
  esdata1:
    driver: local

networks:
  esnet:

Default filebeat.docker.yml file:

filebeat.config:
  modules:
    path: ${path.config}/modules.d/*.yml
    reload.enabled: false

filebeat.autodiscover:
  providers:
    - type: docker
      hints.enabled: true

processors:
- add_cloud_metadata: ~

output.elasticsearch:
  hosts: '${ELASTICSEARCH_HOSTS:elasticsearch:9200}'
  username: '${ELASTICSEARCH_USERNAME:}'
  password: '${ELASTICSEARCH_PASSWORD:}'

And that is it! I'm using the docker label hints to get things bootstrapped.

I hit the nginx page a couple of times to verify logs are ingested. Everything works great at first, but after only a few minutes of inactivity filebeat throws this in the log:

filebeat_1 | 2019-02-12T21:17:30.428Z INFO log/harvester.go:280 File is inactive: /var/lib/docker/containers/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173-json.log. Closing because close_inactive of 5m0s reached.
filebeat_1 | 2019-02-12T21:17:30.428Z INFO log/harvester.go:280 File is inactive: /var/lib/docker/containers/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173-json.log. Closing because close_inactive of 5m0s reached.
filebeat_1 | 2019-02-12T21:17:35.393Z INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":160,"time":{"ms":10}},"total":{"ticks":590,"time":{"ms":18},"value":590},"user":{"ticks":430,"time":{"ms":8}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":7},"info":{"ephemeral_id":"b778b65a-c4e0-419f-84c7-7b64132b0418","uptime":{"ms":453027}},"memstats":{"gc_next":4990752,"memory_alloc":2889064,"memory_total":26235152,"rss":-1458176}},"filebeat":{"events":{"added":2,"done":2},"harvester":{"closed":2,"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":2,"events":{"active":0,"filtered":2,"total":2}}},"registrar":{"states":{"current":2,"update":2},"writes":{"success":2,"total":2}},"system":{"load":{"1":0.83,"15":1.63,"5":1.51,"norm":{"1":0.1038,"15":0.2037,"5":0.1888}}}}}}
filebeat_1 | 2019-02-12T21:17:35.437Z INFO log/harvester.go:255 Harvester started for file: /var/lib/docker/containers/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173-json.log
filebeat_1 | 2019-02-12T21:17:35.437Z INFO log/harvester.go:255 Harvester started for file: /var/lib/docker/containers/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173-json.log
filebeat_1 | 2019-02-12T21:17:35.437Z ERROR log/harvester.go:282 Read line error: parsing CRI timestamp: parsing time "*2" as "2006-01-02T15:04:05Z07:00": cannot parse "*2" as "2006"; File: /var/lib/docker/containers/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173-json.log
filebeat_1 | 2019-02-12T21:17:35.437Z ERROR log/harvester.go:282 Read line error: parsing CRI timestamp: parsing time "log":"192.168.96.1" as "2006-01-02T15:04:05Z07:00": cannot parse "log":"192.168.96.1" as "2006"; File: /var/lib/docker/containers/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173/81d7bcbd259eeb68292d8143f69cc4f110be5411e97e8d96da032ebae5800173-json.log

Not sure what to do next to get things moving forward.

As an update - I rolled back the filebeat image to 6.4.0 and so far have not encountered any errors after inactivity.

@aaronkaz thanks for the report. I started the docker compose scenario you shared and I could actually reproduce the issue: after some time of inactivity these errors start appearing. I am reopening the issue for further investigation.

@jsoriano Why are you going to reopen the issue? @aaronkaz uses images which don't include the fix provided by @exekias. Try the exekias/filebeat:6.5-snapshot image.

@vitkovskii you are right, the fix was backported to the 6.6 branch but didn't make it into 6.6.0.

The upcoming 6.6.1 will include the fix; closing this again.

Facing the same issue with Elastic stack version 7.5.1:

2020-12-21T07:13:00.894Z ERROR readjson/docker_json.go:201 Parse line error: parsing CRI timestamp: parsing time "{"log":"reading" as "2006-01-02T15:04:05.999999999Z07:00": cannot parse "{"log":"reading" as "2006"
My setup is: filebeat collects the logs and sends them to Logstash, and from there to Elasticsearch.

It was working fine a day ago; in fact, 3 of 5 filebeat pods are still in a running state, but some filebeat pods keep crashing.

Can someone tell me what changes I need to make? I am on Kubernetes and also have an nginx-controller running for the cluster.

@jsoriano

@shishirkhandelwal29 I think this can be related to the issue that https://github.com/elastic/beats/pull/22685 tries to address.
