Beats: Add pod-uid support for add_kubernetes_metadata matchers

Created on 7 May 2018 · 12 Comments · Source: elastic/beats

Hi everyone,

I am opening this enhancement request as recommended here.

Our use case is the following:

In our current scenario we have a Kubernetes cluster with several pods on different nodes, using Filebeat to stream the logs to an Elasticsearch host. The applications write to a few different log files and, for this and some other reasons, we are not using the stdout and stderr outputs for Filebeat. Instead, we have created volumes on the host file system and mounted them in the Filebeat pod so it can read the logs and send them.
This works fine except for the fact that all events appear as originating from the Filebeat pod and we have lost all the Kubernetes metadata that normally gets appended.

The pods' log volumes are mounted in:
/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/<volume-name>
And thus the source field of all our events looks like this:
/var/lib/kubelet/pods/005f3b90-4b9d-12f8-acf0-31020a840133/volumes/kubernetes.io~empty-dir/applogs/server.log

The idea is to be able to use the pod-uid as a starting point to obtain Kubernetes metadata for Beats, in addition to the existing container-id starting point. I am particularly interested in having this for Filebeat, although depending on the approach taken it could end up being available for all the Beats.
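
As an illustration of that starting point, here is a minimal Go sketch of how the pod UID could be pulled out of a source path like the one above. The helper name podUIDFromPath and the regular expression are assumptions made for this example; they are not part of Beats, and the pattern assumes the usual 8-4-4-4-12 hexadecimal UID layout.

```go
package main

import (
	"fmt"
	"regexp"
)

// podUIDPattern matches the UID directory that follows /pods/ in a kubelet
// volume path. Assumption: UIDs use the 8-4-4-4-12 hexadecimal layout.
var podUIDPattern = regexp.MustCompile(
	`/pods/([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})/`)

// podUIDFromPath is a hypothetical helper. It returns the pod UID found in
// path, or false when the path does not contain one.
func podUIDFromPath(path string) (string, bool) {
	m := podUIDPattern.FindStringSubmatch(path)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	src := "/var/lib/kubelet/pods/005f3b90-4b9d-12f8-acf0-31020a840133/volumes/kubernetes.io~empty-dir/applogs/server.log"
	uid, ok := podUIDFromPath(src)
	fmt.Println(uid, ok)
}
```

With a UID extracted this way, the remaining step is a lookup from UID to pod metadata, which is what the two approaches below are about.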

Basically, I believe there would be two general approaches to build this:

  • Extend the general add_kubernetes_metadata so that it can find K8s metadata using the pod-uid
  • Modify Filebeat by adding a new type of Matcher or by mapping the pod-uid to a container-uid before this mechanism kicks in.

A caveat is that by only focusing on the pod-uid we could be losing the info about the originating container.

Any thoughts or recommendations about the proposal and the approaches?
Cheers!

containers enhancement

All 12 comments

This could be interesting, could you explain a little more about your use case? I guess you don't have the container id available to do the match. In general that would be preferred, as with the pod uid you miss the info about the source container.

In any case, this may be useful so I think it's good we keep track of interest here.

As for the approach, I agree with most of it, we would need to:

  • Extend the general add_kubernetes_metadata so that it can find K8s metadata using the kubernetes.pod.uid
  • Add a new indexer to index by pod uid, then use field matcher for instance
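
To make the indexer idea concrete, here is a rough Go sketch of an index keyed by pod UID. It is only a conceptual model: podMeta and uidIndex are hypothetical names invented for this example, and the code does not use or mirror the actual libbeat Indexer or Matcher interfaces.

```go
package main

import "fmt"

// podMeta is a hypothetical, simplified stand-in for the metadata that
// add_kubernetes_metadata attaches to an event.
type podMeta struct {
	Name      string
	Namespace string
	UID       string
}

// uidIndex maps a pod UID to its metadata. It is a plain map for clarity;
// a real indexer would also have to handle concurrent access.
type uidIndex map[string]podMeta

// add registers a pod under its UID, as an indexer would on add/update.
func (idx uidIndex) add(p podMeta) { idx[p.UID] = p }

// remove drops a pod from the index, as an indexer would on delete.
func (idx uidIndex) remove(uid string) { delete(idx, uid) }

// lookup plays the matcher's role: given a UID taken from an event,
// return the metadata stored for that pod, if any.
func (idx uidIndex) lookup(uid string) (podMeta, bool) {
	p, ok := idx[uid]
	return p, ok
}

func main() {
	idx := uidIndex{}
	idx.add(podMeta{Name: "app-0", Namespace: "default", UID: "005f3b90-4b9d-12f8-acf0-31020a840133"})

	if meta, ok := idx.lookup("005f3b90-4b9d-12f8-acf0-31020a840133"); ok {
		fmt.Println(meta.Namespace, meta.Name)
	}
}
```

Note that, as mentioned above, a lookup by pod UID alone cannot tell which container inside the pod produced the event.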

Hello señor!

I updated the issue with some info from the original post, let me know if it looks good like this.
Regarding the container id, unfortunately we don't have it, because of the way we are mounting the pods' volumes.

I understand that the preferred approach would be to make this available for the whole Beats family. Going to start looking into it and, probably, will be requesting here some support/guidance when I hit a wall. It is going to be my first project in Go, so please bear with me :)

Nice! Don't hesitate to ask for help at any step. You can also open a pull request to gather input as soon as you have something.

@mariomechoulam I have the same idea as you; looking forward to your PR.

Alright, I believe I have a basic understanding of how everything is glued together inside libbeat and how specific Beats add additional Matcher and Indexer behaviours. Now a couple of questions :)

I can see that the kubernetes.pod.uid is already present in the ObjectMeta structure of a Pod when it gets added/updated by the watcher. By creating a new PodUidIndexer I would still be using the kubernetes.MetaGenerator to access and cache the PodMetadata that is already cached in the PodNameIndexer, only adding the pod.uid field.
Would it make sense to enhance the existing Indexer and offer the pod uid metadata out of the box instead?

In the event, the pod uid is only present inside the source field. By using a FieldMatcher and specifying

matchers:
      - fields:
          lookup_fields: ["source"]

I would be getting the whole log path, e.g.
/var/lib/kubelet/pods/005f3b90-4b9d-12f8-acf0-31020a840133/volumes/kubernetes.io~empty-dir/applogs/server.log

Does lookup_fields support regular expressions? The docs do not state this explicitly.

Hope I am on the right track!

@mariomechoulam I found a project called log-pilot, based on Filebeat, that already uses emptyDir volumes to collect application logs. Here is the documentation, although it is in Chinese: https://www.alibabacloud.com/help/zh/doc-detail/68264.htm

Hi @yu-yang2, yes, I saw that one while I was on the lookout for a proper way to do this.
Still, pulling an alternative Filebeat image was not something that I was looking into; that is the main reason why I decided to give it a go here. But if it works for you, go for it :)

Your reasoning looks good, @mariomechoulam. I'm on the fence about including pod.uid in the default metadata, but we can make it optional through a config parameter.

I saw your PR; we can move the conversation there.

@mariomechoulam @exekias Can you tell me how to use add_kubernetes_metadata to add the k8s pod IP to the log metadata? I really need this metadata. I found that the add_host_metadata processor can only add the host IP to the log metadata, and it may cause errors:

fatal error: concurrent map iteration and map write

goroutine 58 [running]:
runtime.throw(0x145b89d, 0x26)
    /usr/local/go/src/runtime/panic.go:616 +0x81 fp=0xc420cc96b0 sp=0xc420cc9690 pc=0x8f9d81
runtime.mapiternext(0xc420cc9778)
    /usr/local/go/src/runtime/hashmap.go:747 +0x55c fp=0xc420cc9740 sp=0xc420cc96b0 pc=0x8d7d4c
github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype.foldMapInterface(0xc4203fd3e0, 0x12d6440, 0xc420560420, 0xc420560420, 0x1487120)
    /root/go/src/github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype/fold_map.go:34 +0xea fp=0xc420cc97e8 sp=0xc420cc9740 pc=0xce6e7a
github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype.foldInterfaceValue(0xc4203fd3e0, 0x13c5f60, 0xc420560420, 0x0, 0x0)
    /root/go/src/github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype/fold.go:89 +0x15b fp=0xc420cc9860 sp=0xc420cc97e8 pc=0xce4f0b
github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype.foldMapInlineInterface(0xc4203fd3e0, 0x13c5f60, 0xc42054a358, 0x95, 0x13c5f60, 0xc42054a358)
    /root/go/src/github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype/fold_map_inline.generated.go:44 +0x153 fp=0xc420cc9918 sp=0xc420cc9860 pc=0xce7a03
github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype.makeFieldInlineFold.func1(0xc4203fd3e0, 0x1337760, 0xc42054a340, 0x99, 0x0, 0x0)
    /root/go/src/github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype/fold_reflect.go:292 +0x86 fp=0xc420cc9968 sp=0xc420cc9918 pc=0xd3b0d6
github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype.makeFieldsFold.func1(0xc4203fd3e0, 0x1337760, 0xc42054a340, 0x99, 0x0, 0x1337760)
    /root/go/src/github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype/fold_reflect.go:177 +0x88 fp=0xc420cc99c0 sp=0xc420cc9968 pc=0xd3aef8
github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype.makeStructFold.func1(0xc4203fd3e0, 0x1337760, 0xc42054a340, 0x99, 0x0, 0x0)
    /root/go/src/github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype/fold_reflect.go:167 +0x95 fp=0xc420cc9a08 sp=0xc420cc99c0 pc=0xd3adf5
github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype.foldAnyReflect(0xc4203fd3e0, 0x1337760, 0xc42054a340, 0x99, 0x99, 0xc42008e800)
    /root/go/src/github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype/fold_reflect.go:511 +0xb9 fp=0xc420cc9a48 sp=0xc420cc9a08 pc=0xcecb59
github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype.foldInterfaceValue(0xc4203fd3e0, 0x1337760, 0xc42054a340, 0xc42054a340, 0xc42054a340)
    /root/go/src/github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype/fold.go:92 +0x1c9 fp=0xc420cc9ac0 sp=0xc420cc9a48 pc=0xce4f79
github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype.(*Iterator).Fold(0xc4203fd3e0, 0x1337760, 0xc42054a340, 0xc42054a340, 0x0)
    /root/go/src/github.com/elastic/beats/vendor/github.com/elastic/go-structform/gotype/fold.go:69 +0x41 fp=0xc420cc9af8 sp=0xc420cc9ac0 pc=0xce4d81
github.com/elastic/beats/libbeat/outputs/elasticsearch.(*jsonEncoder).AddRaw(0xc42046dac0, 0x131c640, 0xc42131e2c0, 0x0, 0x0)
    /root/go/src/github.com/elastic/beats/libbeat/outputs/elasticsearch/enc.go:96 +0x300 fp=0xc420cc9bf8 sp=0xc420cc9af8 pc=0xd68be0
github.com/elastic/beats/libbeat/outputs/elasticsearch.(*jsonEncoder).Add(0xc42046dac0, 0x130b9c0, 0xc421662380, 0x131c640, 0xc42131e2c0, 0xc421662380, 0x0)
    /root/go/src/github.com/elastic/beats/libbeat/outputs/elasticsearch/enc.go:116 +0x8b fp=0xc420cc9c58 sp=0xc420cc9bf8 pc=0xd68c9b
github.com/elastic/beats/libbeat/outputs/elasticsearch.bulkEncodePublishRequest(0x7ffbb00e6940, 0xc42046dac0, 0x14f7c60, 0xc420470800, 0x0, 0xc42131df80, 0x32, 0x382, 0xc421518f60, 0xc42060a900, ...)
    /root/go/src/github.com/elastic/beats/libbeat/outputs/elasticsearch/client.go:355 +0x1ac fp=0xc420cc9d58 sp=0xc420cc9c58 pc=0xd62d1c
github.com/elastic/beats/libbeat/outputs/elasticsearch.(*Client).publishEvents(0xc420413760, 0xc42131df80, 0x32, 0x382, 0x0, 0x0, 0x0, 0x0, 0x0)
    /root/go/src/github.com/elastic/beats/libbeat/outputs/elasticsearch/client.go:286 +0x14e fp=0xc420cc9e98 sp=0xc420cc9d58 pc=0xd624ae
github.com/elastic/beats/libbeat/outputs/elasticsearch.(*Client).Publish(0xc420413760, 0x1510160, 0xc421095140, 0xc42008cfc0, 0xc420cc9f78)
    /root/go/src/github.com/elastic/beats/libbeat/outputs/elasticsearch/client.go:253 +0x43 fp=0xc420cc9f00 sp=0xc420cc9e98 pc=0xd622c3
github.com/elastic/beats/libbeat/outputs.(*backoffClient).Publish(0xc42040dc20, 0x1510160, 0xc421095140, 0x0, 0x0)
    /root/go/src/github.com/elastic/beats/libbeat/outputs/backoff.go:43 +0x4b fp=0xc420cc9f48 sp=0xc420cc9f00 pc=0xcb6c3b
github.com/elastic/beats/libbeat/publisher/pipeline.(*netClientWorker).run(0xc420470bc0)
    /root/go/src/github.com/elastic/beats/libbeat/publisher/pipeline/output.go:90 +0x1a9 fp=0xc420cc9fd8 sp=0xc420cc9f48 pc=0xd91339
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc420cc9fe0 sp=0xc420cc9fd8 pc=0x929b61
created by github.com/elastic/beats/libbeat/publisher/pipeline.makeClientWorker
    /root/go/src/github.com/elastic/beats/libbeat/publisher/pipeline/output.go:31 +0xf0
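
For context, "concurrent map iteration and map write" is the fatal error the Go runtime raises when one goroutine ranges over a map while another goroutine writes to it. The sketch below shows one general way to avoid that pattern: guard the map with a mutex and hand readers a copy. It is not the Beats fix for this trace; safeFields and its methods are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// safeFields is a hypothetical wrapper that guards a map with a mutex.
type safeFields struct {
	mu sync.RWMutex
	m  map[string]interface{}
}

func newSafeFields() *safeFields {
	return &safeFields{m: map[string]interface{}{}}
}

// put writes a key while holding the write lock.
func (s *safeFields) put(k string, v interface{}) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = v
}

// snapshot returns a shallow copy that the caller can range over without
// holding the lock. Nested maps are still shared and would need the same care.
func (s *safeFields) snapshot() map[string]interface{} {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make(map[string]interface{}, len(s.m))
	for k, v := range s.m {
		out[k] = v
	}
	return out
}

func main() {
	f := newSafeFields()
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // writer
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			f.put(fmt.Sprintf("k%d", i%10), i)
		}
	}()
	go func() { // reader, iterating over copies only
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			for range f.snapshot() {
			}
		}
	}()
	wg.Wait()
	fmt.Println(len(f.snapshot()))
}
```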

Hi @yu-yang2
Inside /libbeat/common/kubernetes/types.go you have the PodStatus struct, which contains the PodIP. It would be a matter of doing something very similar to what I did to add that field to the K8s metadata through a configuration option.
My suggestion is to open a new issue/PR for it and I can keep track of it in case you get blocked.
Cheers!
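
A rough Go sketch of that idea follows: optional metadata fields that are only added when a configuration switch is turned on. All types and option names here (pod, podStatus, metaConfig, IncludePodUID, IncludePodIP) are simplified, hypothetical stand-ins chosen for the example; they are not the actual libbeat types or configuration keys.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins; not the libbeat Kubernetes types.
type podStatus struct {
	PodIP string
}

type pod struct {
	Name   string
	UID    string
	Status podStatus
}

// metaConfig holds hypothetical opt-in switches for extra metadata fields.
type metaConfig struct {
	IncludePodUID bool
	IncludePodIP  bool
}

// podMetadata builds the metadata for a pod. The UID and IP are added only
// when the matching option is enabled, so the default output is unchanged.
func podMetadata(p pod, cfg metaConfig) map[string]interface{} {
	meta := map[string]interface{}{"name": p.Name}
	if cfg.IncludePodUID {
		meta["uid"] = p.UID
	}
	if cfg.IncludePodIP && p.Status.PodIP != "" {
		meta["ip"] = p.Status.PodIP
	}
	return map[string]interface{}{"pod": meta}
}

func main() {
	p := pod{
		Name:   "app-0",
		UID:    "005f3b90-4b9d-12f8-acf0-31020a840133",
		Status: podStatus{PodIP: "10.0.0.7"},
	}
	fmt.Println(podMetadata(p, metaConfig{}))
	fmt.Println(podMetadata(p, metaConfig{IncludePodUID: true, IncludePodIP: true}))
}
```

Keeping the extra fields behind switches leaves the default metadata as it is today, which matches the preference for making pod.uid optional expressed earlier in the thread.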

@mariomechoulam, that's very kind of you. I will try first and open a new issue if I run into trouble 😬

Closed by #7072
