Fluent-bit: TLS error connecting to k8s on startup when updating to 1.3.0

Created on 3 Oct 2019 · 12 Comments · Source: fluent/fluent-bit

Bug Report

Describe the bug
Updating the image tag to 1.3.0 with the latest stable helm chart now throws a TLS error on startup.

To Reproduce
Install fluent-bit using version 1.2.2 on k8s using latest stable helm chart
Verify everything works as expected using kubernetes filter
Update image tag to 1.3.0 (also taking into account the fix listed in https://github.com/fluent/fluent-bit/issues/1608)
See that the fluent-bit pods now throw a TLS error

[2019/10/03 15:04:31] [error] [io_tls] flb_io_tls.c:165 X509 - Read/write of file failed
[2019/10/03 15:04:31] [error] [TLS] error reading certificates from /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

Expected behavior
Updating to 1.3.0 with the same configs as used in 1.2.2 should still work.

Screenshots

Your Environment

  • Version used: 1.3.0
  • Configuration: Any using kubernetes filter
  • Environment name and version: k8s 1.11.8
  • Filters and plugins: kubernetes filter
bug fixed

All 12 comments

Verified that the ca.crt and token both exist at /var/run/secrets/kubernetes.io/serviceaccount and are readable. However, they are symlinks, since the helm chart creates these as a configmap based on a secret.
drwxr-xr-x 2 root root 100 Oct 3 16:00 ..2019_10_03_16_00_13.071035063
lrwxrwxrwx 1 root root 31 Oct 3 16:00 ..data -> ..2019_10_03_16_00_13.071035063
lrwxrwxrwx 1 root root 13 Oct 3 16:00 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root 16 Oct 3 16:00 namespace -> ..data/namespace
lrwxrwxrwx 1 root root 12 Oct 3 16:00 token -> ..data/token
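
For anyone who wants to repeat that check from code rather than from a shell, here is a minimal POSIX sketch. The helper names `is_symlink` and `is_readable` are hypothetical and are not part of Fluent Bit; the sketch only confirms that a path is a symlink and that its target can be opened and read, which is roughly what the listing above shows by hand.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: 1 if `path` itself is a symbolic link,
 * 0 if it is something else, -1 if lstat() fails. */
int is_symlink(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0) {
        return -1;
    }
    return S_ISLNK(st.st_mode) ? 1 : 0;
}

/* Hypothetical helper: 1 if `path` can be opened (symlinks are followed)
 * and at least one byte can be read from it, 0 otherwise. */
int is_readable(const char *path)
{
    FILE *fp = fopen(path, "rb");
    int ok;

    if (fp == NULL) {
        return 0;
    }
    ok = (fgetc(fp) != EOF);
    fclose(fp);
    return ok;
}
```

Called on /var/run/secrets/kubernetes.io/serviceaccount/ca.crt inside the pod, both helpers would be expected to return 1 in the situation described here: the file is a symlink, and it is readable, so the symlink alone does not explain the X509 read error.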

Replacing just the 1.3.0 flb_io_tls.c with the 1.2 branch version of flb_io_tls.c and rebuilding the docker image fixes the issue. A diff of flb_io_tls.c between 1.2.2 and 1.3.0:

121a122
>                                             char *vhost,
135a137
>     ctx->vhost     = vhost;
308a311
>         mbedtls_ssl_close_notify(&session->ssl);
333c336,339
<     mbedtls_ssl_set_hostname(&session->ssl,u->tcp_host);
---
>     if (!u->tls->context->vhost) {
>         u->tls->context->vhost = u->tcp_host;
>     }
>     mbedtls_ssl_set_hostname(&session->ssl, u->tls->context->vhost);

It appears https://github.com/fluent/fluent-bit/pull/1313 somehow broke the TLS connection for the kubernetes filter
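
To make the last hunk of the diff easier to read, here is a small standalone sketch of the fallback it introduces: use the configured vhost for the TLS hostname when one is set, otherwise fall back to the upstream TCP host. The struct and function names below are simplified, hypothetical stand-ins and not the real Fluent Bit types; in the actual code the resulting string is passed to `mbedtls_ssl_set_hostname`.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the structures named in the diff. */
struct tls_context {
    const char *vhost;      /* optional hostname override, may be NULL */
};

struct upstream {
    const char *tcp_host;   /* host the TCP connection is made to */
    struct tls_context *ctx;
};

/* Returns the hostname to present during the TLS handshake.
 * As in the 1.3.0 hunk, a missing vhost is filled in from tcp_host
 * and stored back into the context. */
const char *resolve_tls_hostname(struct upstream *u)
{
    if (!u->ctx->vhost) {
        u->ctx->vhost = u->tcp_host;
    }
    return u->ctx->vhost;
}
```

With no vhost configured, this yields the same hostname the 1.2.2 code passed directly (`u->tcp_host`), which fits the later finding in this thread that this change was not the root cause.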

Thanks for pointing out the issue. I am working on the fix now.

I found the root cause of the issue. Surprisingly, it is not #1313; it is a bad prototype in the TLS context creation that can lead to unpredictable behavior at runtime.

Would you please validate whether this image works properly in your environment?

edsiper/flb-tls-fix:3

edsiper/flb-tls-fix:3

seems ok for now:

```
kubectl logs -f fluent-bit-cwd44 -n logging [f7bf60c]
Fluent Bit v1.3.1
Copyright (C) Treasure Data

[2019/10/04 09:57:51] [ info] [storage] initializing...
[2019/10/04 09:57:51] [ info] [storage] in-memory
[2019/10/04 09:57:51] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2019/10/04 09:57:51] [ info] [engine] started (pid=1)
[2019/10/04 09:57:51] [ info] [in_systemd] seek_cursor=s=0fdc9ccbd5794c1b8297787919010f03;i=89a... OK
[2019/10/04 09:57:51] [ info] [filter_kube] https=1 host=kubernetes.default.svc port=443
[2019/10/04 09:57:51] [ info] [filter_kube] local POD info OK
[2019/10/04 09:57:51] [ info] [filter_kube] testing connectivity with API server...
[2019/10/04 09:57:51] [ info] [filter_kube] API server connectivity OK
[2019/10/04 09:57:51] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2019/10/04 09:57:51] [ info] [sp] stream processor started
```

The logs look OK, with k8s metadata.

Thanks for the update.

@gamer22026 can you re-confirm on your end?

This fix is working for me as well.

Thanks! I am working on the new release.

All good. Thanks for the quick turnaround.

New tags are already available:

The official release notes will be out during the day.

The official release is out:

https://fluentbit.io/announcements/v1.3.1/

Thanks everyone for your help!
