fluent-bit v1.5.5 outputs resource temporarily unavailable errors

Created on 9 Sep 2020  路  9Comments  路  Source: fluent/fluent-bit

Bug Report

Describe the bug
I am consistently getting a lot of these error messages with fluent-bit v1.5.5:

[2020/09/08 22:46:59] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:46:59] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/08 22:47:03] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:04] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:18] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:19] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:33] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:34] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:48] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:49] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:03] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:04] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:18] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:19] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:33] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:34] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:48] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:49] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:49:03] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:49:04] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:49:18] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:49:19] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable

Once I downgrade back to v1.5.4, I don't see these "Resource temporarily unavailable" errors but I still see "Inappropriate ioctl for device".

While on v1.5.5, I turned on debug mode and got some related logs:

[2020/09/08 22:45:15] [debug] [input:tail:myfile.tail] inode=25194448 events: IN_MODIFY
[2020/09/08 22:45:15] [debug] [input:tail:qa02_uswest2.kubernetes.tail] inode=25194448 events: IN_MODIFY
[2020/09/08 22:45:15] [debug] [task] created task=0x7f6c93ab2540 id=0 without routes, dropping.
[2020/09/08 22:45:15] [debug] [task] destroy task=0x7f6c93ab2540 (task_id=0)
[2020/09/08 22:45:15] [debug] [task] created task=0x7f6c93ab2540 id=0 without routes, dropping.
[2020/09/08 22:45:15] [debug] [task] destroy task=0x7f6c93ab2540 (task_id=0)
[2020/09/08 22:45:15] [debug] [task] created task=0x7f6c93ab2540 id=0 OK
[2020/09/08 22:45:15] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/08 22:45:15] [debug] [upstream] KA connection #23 is in a failed state to: logging.googleapis.com:443, cleaning up
[2020/09/08 22:45:16] [debug] [output:stackdriver:stackdriver.0] HTTP Status=200
[2020/09/08 22:45:16] [debug] [upstream] KA connection #23 to logging.googleapis.com:443 is now available
Looking at the GCP Cloud Log API stats, I am only sending 20 logs per second.

To Reproduce

  • Steps to reproduce the problem:
    Configure fluent-bit to forward logs using stackdriver output plugin and forward plugin to an aggregator like so:
[OUTPUT]
    Name  stackdriver
    Match ${STACK}.*
    resource global

[OUTPUT]
    Name          forward
    Match         noflake
    Host          ${AGGREGATOR_HOST}
    Port          ${AGGREGATOR_PORT}
    Self_Hostname flb.myhost
    tls           on
    tls.verify    off

Expected behavior
Since I am not sending a lot of logs, I don't expect to see any Resource temporarily unavailable type errors especially when this error happens so often. Since switching back to older version of fluent-bit (1.5.4), the error does not show up, either the new version is reveling a hidden problem in my setup or version 1.5.5 has some problem.

Your Environment

  • Version used: Fluent-bit 1.5.5
  • Configuration: Stackdriver and fluentd aggregators
  • Environment name and version (e.g. Kubernetes? What version?): K8S 1.16
  • Server type and version: K8S pod on AWS EC2
  • Operating System and version:
  • Filters and plugins: stackdriver and forward output pulgin

Additional context
Fluent-bit version 1.5.5 has TCP timeout fix I need. But running it result in additional errors that I don't expect

bug fixed

All 9 comments

@stevenarvar Can you try removing each of the outputs one by one and isolate whether this comes from the forward output or the stackdriver output, or both?

@PettitWesley I disabled all OUTPUT plugins except one stackdriver output and here is what I get:

Fluent Bit v1.5.5
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/09/09 22:47:39] [ info] [engine] started (pid=1)
[2020/09/09 22:47:39] [ info] [storage] version=1.0.5, initializing...
[2020/09/09 22:47:39] [ info] [storage] in-memory
[2020/09/09 22:47:39] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.2] https=1 host=kubernetes.default.svc port=443
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.2] local POD info OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.2] testing connectivity with API server...
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.2] API server connectivity OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.18] https=1 host=kubernetes.default.svc port=443
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.18] local POD info OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.18] testing connectivity with API server...
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.18] API server connectivity OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.32] https=1 host=kubernetes.default.svc port=443
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.32] local POD info OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.32] testing connectivity with API server...
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.32] API server connectivity OK
[2020/09/09 22:47:40] [ info] [oauth2] HTTP Status=200
[2020/09/09 22:47:40] [ info] [oauth2] access token from 'www.googleapis.com:443' retrieved
[2020/09/09 22:47:40] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2020/09/09 22:47:40] [ info] [sp] stream processor started
[2020/09/09 22:47:40] [ info] inotify_fs_add(): inode=86166998 watch_fd=1 name=/var/log/containers/ ......
.
.
.
.
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device

The above error just keep showing up ever 2 seconds or so. Below is the OUTPUT plugin config:

[OUTPUT]
    Name  stackdriver
    Match ${STACK}.kubernetes.*
    resource global

Did a brief experiment with the FORWARD OUTPUT, the same errors are not observed with the FORWARD OUTPUT.

I think the error problem is because of this:

https://github.com/fluent/fluent-bit/blob/1.5/src/flb_upstream.c#L356

@edsiper Should we remove that call to flb_errno(), or is there some way we can make the errno print go to a debug log?

yes, I introduced that one in the latest version accidentally. So the exception is good (because the keepalive connection is in a bad state and cannot be reused), but the error message is noisy and gives the wrong perception

I'm using forward output only and am also seeing (slightly different) error messages with 1.5.5 which point to the same line, and they also go away once I rollback to 1.5.4. It is mostly various combinations of:

[2020/09/10 20:27:03] [error] [src/flb_upstream.c:356 errno=2] No such file or directory
[2020/09/10 20:27:04] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device

In debug mode it's similar, KA connection is available then becomes failed state.

I'm commenting here as it looks really similar and to let you know that I see it with forward output also, but please let me know if I should open a new issue instead.

This is fixed in 1a959e79f893455e4c43ae4adf8726b10d44bef7

I will release v1.5.6 tomorrow Friday with this fix on it.

Ran v1.5.6, don't see the errors

Was this page helpful?
0 / 5 - 0 ratings