Currently, the first few entries (I'm not sure whether it's only one or more) read from Docker container logs end up without Kubernetes metadata.
The first few entries look like:
{"@timestamp": "2017-10-11T21:28:20.401Z","@metadata":{"beat":"filebeat","type":"doc","version":"6.0.0-rc1"},"source":"/var/lib/docker/containers/0caeb502469d7f0dfb90c4154294f95f2f9dd206a21cc003f118353d0d6fad75/0caeb502469d7f0dfb90c4154294f95f2f9dd206a21cc003f118353d0d6fad75-json.log","offset":259,"message":"{\"log\":\"2017/10/11 21:28:20.327611 beat.go:430: INFO Home path: [/usr/share/filebeat] Config path: [/usr/share/filebeat] Data path: [/usr/share/filebeat/data] Logs path: [/usr/share/filebeat/logs]\\n\",\"stream\":\"stderr\",\"time\":\"2017-10-11T21:28:20.328149382Z\"}","beat":{"name":"filebeat-filebeat-dwjf3","hostname":"filebeat-filebeat-dwjf3","version":"6.0.0-rc1"}}
They are then followed by much nicer entries:
{"@timestamp": "2017-10-11T21:28:55.426Z","@metadata":{"beat":"filebeat","type":"doc","version":"6.0.0-rc1"},"beat":{"name":"filebeat-filebeat-dwjf3","hostname":"filebeat-filebeat-dwjf3","version":"6.0.0-rc1"},"kubernetes":{"container":{"name":"filebeat"},"pod":{"name":"filebeat-filebeat-dwjf3"},"namespace":"default","labels":{"release":"filebeat","app":"filebeat","controller-revision-hash":"4244555777","pod-template-generation":"1"}},"source":"/var/lib/docker/containers/0caeb502469d7f0dfb90c4154294f95f2f9dd206a21cc003f118353d0d6fad75/0caeb502469d7f0dfb90c4154294f95f2f9dd206a21cc003f118353d0d6fad75-json.log","offset":13157,"message":"{\"log\":\"2017/10/11 21:28:50.328022 metrics.go:39: INFO Non-zero metrics in the last 30s: beat.memstats.gc_next=14161552 beat.memstats.memory_alloc=8355272 beat.memstats.memory_total=415714016 filebeat.events.active=4117 filebeat.events.added=90167 filebeat.events.done=86050 filebeat.harvester.open_files=34 filebeat.harvester.running=34 filebeat.harvester.started=34 libbeat.output.events.acked=86016 libbeat.output.events.active=2048 libbeat.output.events.batches=43 libbeat.output.events.total=88064 libbeat.output.type=console libbeat.output.write.bytes=39752840 libbeat.pipeline.clients=1 libbeat.pipeline.events.active=4117 libbeat.pipeline.events.filtered=34 libbeat.pipeline.events.published=90132 libbeat.pipeline.events.total=90167 libbeat.pipeline.queue.acked=86016 registrar.states.current=34 registrar.states.update=86050 registrar.writes=44\\n\",\"stream\":\"stderr\",\"time\":\"2017-10-11T21:28:50.32829028Z\"}"}
In this case, 35 seconds passed before the kubernetes:{} section started being injected. This means that all Docker logs shipped while add_kubernetes_metadata is still starting up are effectively lost to analysis (unless one knows to look for entries without Kubernetes metadata).
It would be great if we could delay shipping Docker logs until the Kubernetes metadata is ready.
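For anyone who wants to check how many events are affected, here is a minimal sketch, assuming the events are available as one JSON object per line (as with the console output shown above). The function name `split_by_kubernetes_metadata` and the trimmed sample events are made up for illustration; only the `kubernetes` key and the offsets come from the entries above.

```python
import json


def split_by_kubernetes_metadata(lines):
    """Partition JSON-encoded events by whether they carry a top-level 'kubernetes' key."""
    with_meta, without_meta = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        if "kubernetes" in event:
            with_meta.append(event)
        else:
            without_meta.append(event)
    return with_meta, without_meta


# Trimmed-down stand-ins for the two events quoted above (illustrative only).
sample = [
    '{"@timestamp": "2017-10-11T21:28:20.401Z", "offset": 259}',
    '{"@timestamp": "2017-10-11T21:28:55.426Z", "offset": 13157, '
    '"kubernetes": {"pod": {"name": "filebeat-filebeat-dwjf3"}}}',
]

with_meta, without_meta = split_by_kubernetes_metadata(sample)
print(len(with_meta), "with metadata,", len(without_meta), "without")
```

The same check could be applied to a captured output file by passing the open file object instead of `sample`.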
@tarasglek Any chance you could share your filebeat logs?
@vjsamuel Do you also have this issue?
@vjsamuel Try this. This is filebeat logging itself.
problem.tar.gz
The issue is that the processor is not instantly ready; it takes some time to fill its container state, so the first messages may miss metadata. I'll think of ways to fix this, but it should not be an issue once autodiscover (https://github.com/elastic/beats/pull/5245) gets in, as autodiscover could be in charge of adding event metadata.
@exekias The podwatcher has to perform a full sync before the processor starts up and its instance is returned to the beat pipeline. Hence, as the logs also show, the processor was initialized with a full sync before the first file was tailed.
Code:
https://github.com/elastic/beats/blob/master/libbeat/processors/add_kubernetes_metadata/kubernetes.go#L180
https://github.com/elastic/beats/blob/master/libbeat/processors/add_kubernetes_metadata/podwatcher.go#L123
@tarasglek The issue is that containers often start generating logs even before the container flips into the Running state. We add a pod's annotations to the lookup index only after it gets into the Running state, so if the startup logs come in before that, they end up without the annotations. But, as @exekias mentioned, look forward to a cleaner implementation with autodiscover.
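To make the timing concrete, here is a toy model of the behavior described above. It is a sketch under stated assumptions, not the libbeat implementation: `PodIndex`, `on_pod_update`, `enrich`, and the shortened container ID are all hypothetical names chosen for illustration. The only thing it takes from the thread is the rule that metadata is indexed once the pod reaches the Running state.

```python
class PodIndex:
    """Toy lookup index keyed by container ID (hypothetical, not the libbeat type)."""

    def __init__(self):
        self._by_container = {}

    def on_pod_update(self, container_id, phase, metadata):
        # Assumption taken from the comment above: only Running pods are indexed.
        if phase == "Running":
            self._by_container[container_id] = metadata

    def enrich(self, event):
        # Return a copy of the event, adding 'kubernetes' only if the lookup hits.
        enriched = dict(event)
        meta = self._by_container.get(event["container_id"])
        if meta is not None:
            enriched["kubernetes"] = meta
        return enriched


index = PodIndex()
meta = {"pod": {"name": "filebeat-filebeat-dwjf3"}, "namespace": "default"}

# The container writes a log line while the pod is still Pending.
index.on_pod_update("0caeb502", "Pending", meta)
early = index.enrich({"container_id": "0caeb502", "message": "startup line"})

# The pod flips to Running; later lines now get metadata.
index.on_pod_update("0caeb502", "Running", meta)
late = index.enrich({"container_id": "0caeb502", "message": "later line"})

print("kubernetes" in early, "kubernetes" in late)
```

In this model the early event is shipped without a `kubernetes` section simply because the lookup misses, which matches the symptom reported at the top of the thread.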
@vjsamuel @exekias Hi! I've read various issues in this beats repo and the blog post introducing collectbeat, and I'm still trying to catch up.
Perhaps, as of today, we still need to use another discovery mechanism, like collectbeat's one dedicated to k8s, if we don't want to drop logs (and their metadata) at the very beginning of a pod?
Is there anything I can do to help upstream collectbeat's k8s discovery impl (which is so nice, awesome job btw 😄) to beats?
@mumoshu k8s discovery is already part of autodiscover.
https://www.elastic.co/guide/en/beats/filebeat/master/configuration-autodiscover.html
https://www.elastic.co/guide/en/beats/metricbeat/master/configuration-autodiscover.html
Thank you @mumoshu, as @vjsamuel said, this code has been merged to master already and will be part of the next 6.3 release. You are more than welcome to give it a try already and submit issues or new features!
I'm closing this, as autodiscover allows you to ensure metadata is present by the time the prospector is launched. Have a look at https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover.html
Sorry for my question, I'm not super familiar with filebeat.
If we switch to autodiscover, should we still keep the add_kubernetes_metadata processor, or use just one or the other? Thanks for the help!
Kubernetes autodiscover will automatically add Kubernetes metadata for you; it accepts the same parameters as add_kubernetes_metadata :slightly_smiling_face:
Please use our discuss forum for future questions: https://discuss.elastic.co/c/beats