Describe the bug
I am consistently getting a lot of these error messages with fluent-bit v1.5.5:
[2020/09/08 22:46:59] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:46:59] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/08 22:47:03] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:04] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:18] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:19] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:33] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:34] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:48] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:47:49] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:03] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:04] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:18] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:19] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:33] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:34] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:48] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:48:49] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:49:03] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:49:04] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:49:18] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/08 22:49:19] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
Once I downgrade back to v1.5.4, I don't see these "Resource temporarily unavailable" errors but I still see "Inappropriate ioctl for device".
While on v1.5.5, I turned on debug mode and got some related logs:
[2020/09/08 22:45:15] [debug] [input:tail:myfile.tail] inode=25194448 events: IN_MODIFY
[2020/09/08 22:45:15] [debug] [input:tail:qa02_uswest2.kubernetes.tail] inode=25194448 events: IN_MODIFY
[2020/09/08 22:45:15] [debug] [task] created task=0x7f6c93ab2540 id=0 without routes, dropping.
[2020/09/08 22:45:15] [debug] [task] destroy task=0x7f6c93ab2540 (task_id=0)
[2020/09/08 22:45:15] [debug] [task] created task=0x7f6c93ab2540 id=0 without routes, dropping.
[2020/09/08 22:45:15] [debug] [task] destroy task=0x7f6c93ab2540 (task_id=0)
[2020/09/08 22:45:15] [debug] [task] created task=0x7f6c93ab2540 id=0 OK
[2020/09/08 22:45:15] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/08 22:45:15] [debug] [upstream] KA connection #23 is in a failed state to: logging.googleapis.com:443, cleaning up
[2020/09/08 22:45:16] [debug] [output:stackdriver:stackdriver.0] HTTP Status=200
[2020/09/08 22:45:16] [debug] [upstream] KA connection #23 to logging.googleapis.com:443 is now available
Looking at the GCP Cloud Log API stats, I am only sending 20 logs per second.
To Reproduce
[OUTPUT]
Name stackdriver
Match ${STACK}.*
resource global
[OUTPUT]
Name forward
Match noflake
Host ${AGGREGATOR_HOST}
Port ${AGGREGATOR_PORT}
Self_Hostname flb.myhost
tls on
tls.verify off
Expected behavior
Since I am not sending a lot of logs, I don't expect to see any Resource temporarily unavailable type errors especially when this error happens so often. Since switching back to older version of fluent-bit (1.5.4), the error does not show up, either the new version is reveling a hidden problem in my setup or version 1.5.5 has some problem.
Your Environment
Additional context
Fluent-bit version 1.5.5 has TCP timeout fix I need. But running it result in additional errors that I don't expect
@stevenarvar Can you try removing each of the outputs one by one and isolate whether this comes from the forward output or the stackdriver output, or both?
@PettitWesley I disabled all OUTPUT plugins except one stackdriver output and here is what I get:
Fluent Bit v1.5.5
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
[2020/09/09 22:47:39] [ info] [engine] started (pid=1)
[2020/09/09 22:47:39] [ info] [storage] version=1.0.5, initializing...
[2020/09/09 22:47:39] [ info] [storage] in-memory
[2020/09/09 22:47:39] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.2] https=1 host=kubernetes.default.svc port=443
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.2] local POD info OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.2] testing connectivity with API server...
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.2] API server connectivity OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.18] https=1 host=kubernetes.default.svc port=443
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.18] local POD info OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.18] testing connectivity with API server...
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.18] API server connectivity OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.32] https=1 host=kubernetes.default.svc port=443
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.32] local POD info OK
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.32] testing connectivity with API server...
[2020/09/09 22:47:40] [ info] [filter:kubernetes:kubernetes.32] API server connectivity OK
[2020/09/09 22:47:40] [ info] [oauth2] HTTP Status=200
[2020/09/09 22:47:40] [ info] [oauth2] access token from 'www.googleapis.com:443' retrieved
[2020/09/09 22:47:40] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2020/09/09 22:47:40] [ info] [sp] stream processor started
[2020/09/09 22:47:40] [ info] inotify_fs_add(): inode=86166998 watch_fd=1 name=/var/log/containers/ ......
.
.
.
.
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:11] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=11] Resource temporarily unavailable
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
[2020/09/09 22:48:12] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
The above error just keep showing up ever 2 seconds or so. Below is the OUTPUT plugin config:
[OUTPUT]
Name stackdriver
Match ${STACK}.kubernetes.*
resource global
Did a brief experiment with the FORWARD OUTPUT, the same errors are not observed with the FORWARD OUTPUT.
I think the error problem is because of this:
https://github.com/fluent/fluent-bit/blob/1.5/src/flb_upstream.c#L356
@edsiper Should we remove that call to flb_errno(), or is there some way we can make the errno print go to a debug log?
yes, I introduced that one in the latest version accidentally. So the exception is good (because the keepalive connection is in a bad state and cannot be reused), but the error message is noisy and gives the wrong perception
I'm using forward output only and am also seeing (slightly different) error messages with 1.5.5 which point to the same line, and they also go away once I rollback to 1.5.4. It is mostly various combinations of:
[2020/09/10 20:27:03] [error] [src/flb_upstream.c:356 errno=2] No such file or directory
[2020/09/10 20:27:04] [error] [src/flb_upstream.c:356 errno=25] Inappropriate ioctl for device
In debug mode it's similar, KA connection is available then becomes failed state.
I'm commenting here as it looks really similar and to let you know that I see it with forward output also, but please let me know if I should open a new issue instead.
This is fixed in 1a959e79f893455e4c43ae4adf8726b10d44bef7
I will release v1.5.6 tomorrow Friday with this fix on it.
Ran v1.5.6, don't see the errors