Describe the bug
forward output is using multiple TCP (HTTP/TLS) connections to fluentd.
To Reproduce
fluent-bit with the following configuration:
# ...
[OUTPUT]
    Name          forward
    Match         *
    Host          fluentd_hostname
    Port          fluentd_port
    tls           On
    tls.verify    On
    tls.ca_file   /fluent-bit/ssl/ca.crt.pem
    tls.crt_file  /fluent-bit/ssl/client.crt.pem
    tls.key_file  /fluent-bit/ssl/client.key.pem
    Shared_Key    [redacted]
fluentd logs:
fluentd_1 | 2020-07-21 13:38:55 +0000 [debug]: #0 generating helo
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating helo
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating helo
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating helo
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating helo
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating helo
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 checking ping
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating pong
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 connection established address="fluent-bit_ip" port=50057
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 checking ping
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating pong
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 connection established address="fluent-bit_ip" port=55213
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 checking ping
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating pong
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 connection established address="fluent-bit_ip" port=25835
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 checking ping
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating pong
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 connection established address="fluent-bit_ip" port=64334
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 checking ping
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating pong
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 connection established address="fluent-bit_ip" port=51753
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 checking ping
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 generating pong
fluentd_1 | 2020-07-21 13:38:56 +0000 [debug]: #0 connection established address="fluent-bit_ip" port=3652
On the fluentd side, check that the connections are indeed established:
$ netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 fluentd_ip:24231 fluent-bit_ip:51753 ESTABLISHED
tcp 0 0 fluentd_ip:24231 fluent-bit_ip:25835 ESTABLISHED
tcp 0 0 fluentd_ip:24231 fluent-bit_ip:36520 ESTABLISHED
tcp 0 0 fluentd_ip:24231 fluent-bit_ip:64334 ESTABLISHED
tcp 0 0 fluentd_ip:24231 fluent-bit_ip:55213 ESTABLISHED
tcp 0 0 fluentd_ip:24231 fluent-bit_ip:50057 ESTABLISHED
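For anyone who wants to repeat this check without counting rows by hand, here is a minimal sketch that tallies ESTABLISHED connections per remote host from netstat output. The function name `count_established` and the sample text are illustrative only; it assumes the six-column layout shown above and that the forward input listens on port 24231 as in this report.

```python
from collections import Counter


def count_established(netstat_output: str, local_port: int) -> Counter:
    """Count ESTABLISHED TCP connections per remote host on a local port."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State.
        # The header row splits into more fields and is skipped here.
        if len(fields) != 6 or fields[0] != "tcp" or fields[5] != "ESTABLISHED":
            continue
        local, foreign = fields[3], fields[4]
        # Keep only connections that terminate on the given local port.
        if local.rsplit(":", 1)[-1] != str(local_port):
            continue
        # Drop the ephemeral source port so connections group by host.
        counts[foreign.rsplit(":", 1)[0]] += 1
    return counts


# Hypothetical sample in the same shape as the netstat output above.
sample = """\
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 fluentd_ip:24231 fluent-bit_ip:51753 ESTABLISHED
tcp 0 0 fluentd_ip:24231 fluent-bit_ip:25835 ESTABLISHED
"""

if __name__ == "__main__":
    for host, n in count_established(sample, 24231).items():
        print(f"{host}: {n} established connection(s)")
```

A count above 1 for the fluent-bit host is the behavior this issue describes.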
Expected behavior
When using the forward output, fluent-bit uses a single TCP connection to fluentd.
Your Environment
Is this expected behavior? And if so, can fluent-bit be configured to only use one connection?
At the moment that option is not available. By default, Fluent Bit v1.5 always uses KeepAlive connections, and each connection is limited to 30 seconds if idle.
If you want Fluent Bit to perform a new TCP/TLS connection upon flush time, you can disable keepalive mode with net.keepalive false in your output configuration section.
We will introduce shortly the option to limit the maximum number of keepalive connections.
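As a minimal sketch of the workaround described above, this is the reporter's forward output with keepalive disabled. Host and Port are the placeholders from the original report, and the tls.* and Shared_Key lines from that report would stay as they are (omitted here for brevity).

```
[OUTPUT]
    Name          forward
    Match         *
    Host          fluentd_hostname
    Port          fluentd_port
    # Open a fresh connection at flush time instead of keeping connections alive
    net.keepalive false
```

Note that this does not give a single long-lived connection. It trades the pool of keepalive connections for a new TCP/TLS connection at flush time.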
The networking / keepalive documentation doesn't explicitly say that if keepalive is enabled (which is also the default) this implies multiple connections. On the contrary, the wording suggests it's a single connection:
The concept of TCP Keepalive refers to the ability of the client (Fluent Bit on this case) to keep the TCP connection open in a persistent way, that means that once the connection is created and used, instead of close it, it can be recycled.
Out of curiosity, what was the technical reasoning behind multiple connections?
More to lbogdan's point: the example given in the docs for v1.5 (https://docs.fluentbit.io/manual/administration/networking#example) doesn't work as advertised.
I've tested that exact config (from the networking doc page) on localhost on RHEL 7.6 with td-agent-bit v1.5.4. With that config, the TCP connection is closed by the source (nc -l dies) after 1 message, unlike what is shown in the output in the docs.
Testing with nc --keep-open -l and more verbosity shows that a new TCP connection is brought up at every flush interval and the "previous" connections never get reused (they don't even get to live until the net.keepalive_idle_timeout period).
While the rationale behind it is very sound (and a desirable goal, as this ticket also suggests), it doesn't look like https://github.com/fluent/fluent-bit/issues/1704 works.