haproxy -vv and uname -a
HA-Proxy version 2.2.0 2020/07/07 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2025.
Known bugs: http://www.haproxy.org/bugs/bugs-2.2.0.html
Running on: Linux 4.15.0-42-generic #45-Ubuntu SMP Thu Nov 15 19:32:57 UTC 2018 x86_64
Build options :
TARGET = linux-glibc
CPU = generic
CC = gcc
CFLAGS = -O2 -g -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-stringop-overflow -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
OPTIONS = USE_PCRE=1 USE_LINUX_TPROXY=1 USE_LINUX_SPLICE=1 USE_LIBCRYPT=1 USE_OPENSSL=1 USE_ZLIB=1 USE_SYSTEMD=1
Feature list : +EPOLL -KQUEUE +NETFILTER +PCRE -PCRE_JIT -PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H +GETADDRINFO +OPENSSL -LUA +FUTEX +ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS
Default settings :
bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Built with multi-threading support (MAX_THREADS=64, default=16).
Built with OpenSSL version : OpenSSL 1.1.0g 2 Nov 2017
Running on OpenSSL version : OpenSSL 1.1.0g 2 Nov 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
Built with network namespace support.
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE version : 8.39 2016-06-14
Running on PCRE version : 8.39 2016-06-14
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 7.3.0
Built with the Prometheus exporter as a service
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
fcgi : mode=HTTP side=BE mux=FCGI
<default> : mode=HTTP side=FE|BE mux=H1
h2 : mode=HTTP side=FE|BE mux=H2
<default> : mode=TCP side=FE|BE mux=PASS
Available services :
prometheus-exporter
Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace
[CACHE] cache
[FCGI] fcgi-app
# uname -a
Linux hostname-xxx 4.15.0-42-generic #45-Ubuntu SMP Thu Nov 15 19:32:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
global
user haproxy
group haproxy
nbproc 1
nbthread 16
cpu-map auto:1/1-16 0-15
log /dev/log local2
log /dev/log local0 notice
chroot /path/to/haproxy
pidfile /var/run/haproxy.pid
daemon
master-worker
maxconn 200000
hard-stop-after 1h
stats socket /path/to/haproxy/socket mode 660 level admin expose-fd listeners
tune.ssl.cachesize 3000000
tune.ssl.lifetime 60000
ssl-default-bind-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
ssl-default-bind-options ssl-min-ver TLSv1.2 ssl-max-ver TLSv1.2
server-state-file /path/to/haproxy/haproxy_server_states
tune.bufsize 40960
defaults
mode http
log global
retries 3
timeout http-request 10s
timeout queue 10s
timeout connect 10s
timeout client 1m
timeout server 1m
timeout tunnel 10m
timeout client-fin 30s
timeout server-fin 30s
timeout check 10s
option httplog
option forwardfor except 127.0.0.0/8
option redispatch
load-server-state-from-file global
We use HAProxy in our environment and recently upgraded from 2.0 to 2.1.0.
Now, every time we reload, the number of open FDs increases by 1. This was not the case in version 2.0.
The FD limit for the root user is set to 1024, so after 1024 reloads the haproxy process starts to fail and reloads no longer work;
new changes are not applied.
This happens in HAProxy 2.2 as well.
For background, we reload HAProxy very frequently, so we hit the FD limit quickly and our services start to fail.
The issue does not occur if we do not load the server state during reload.
The experiment below shows the FD count increasing.
root@haproxynode:/usr/local/src/haproxy-2.2.0# ls -l /proc/$(systemctl status haproxy |grep 'Main PID' |awk '{print $3}')/fd
total 0
lr-x------ 1 root root 64 Jul 9 09:11 0 -> /dev/null
lrwx------ 1 root root 64 Jul 9 09:11 1 -> 'socket:[254090076]'
lrwx------ 1 root root 64 Jul 9 09:11 10 -> 'socket:[254090089]'
lrwx------ 1 root root 64 Jul 9 09:11 2 -> 'socket:[254090076]'
lrwx------ 1 root root 64 Jul 9 09:11 3 -> 'socket:[254090081]'
lrwx------ 1 root root 64 Jul 9 09:11 4 -> 'anon_inode:[eventpoll]'
lr-x------ 1 root root 64 Jul 9 09:11 5 -> /path/to/haproxy/server-state
lr-x------ 1 root root 64 Jul 9 09:11 6 -> 'pipe:[254090092]'
l-wx------ 1 root root 64 Jul 9 09:11 7 -> 'pipe:[254090092]'
root@haproxynode:/usr/local/src/haproxy-2.2.0# systemctl reload haproxy
root@haproxynode:/usr/local/src/haproxy-2.2.0# ls -l /proc/$(systemctl status haproxy |grep 'Main PID' |awk '{print $3}')/fd
total 0
lr-x------ 1 root root 64 Jul 9 09:11 0 -> /dev/null
lrwx------ 1 root root 64 Jul 9 09:11 1 -> 'socket:[254090076]'
l-wx------ 1 root root 64 Jul 9 09:11 10 -> 'pipe:[254090150]'
lrwx------ 1 root root 64 Jul 9 09:11 2 -> 'socket:[254090076]'
lrwx------ 1 root root 64 Jul 9 09:11 4 -> 'socket:[254090139]'
lr-x------ 1 root root 64 Jul 9 09:11 5 -> /path/to/haproxy/server-state
lrwx------ 1 root root 64 Jul 9 09:11 6 -> 'anon_inode:[eventpoll]'
lr-x------ 1 root root 64 Jul 9 09:11 7 -> /path/to/haproxy/server-state
lr-x------ 1 root root 64 Jul 9 09:11 8 -> 'pipe:[254090150]'
lrwx------ 1 root root 64 Jul 9 09:11 9 -> 'socket:[254090148]'
The number of FDs open on the server-state file increased to 2. Is this a known bug, or do we have to do something differently in newer versions?
A reload should not increase the FD count every time and leave those FDs idle forever.
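For anyone who wants to track the leak programmatically rather than by reading `ls -l` output, here is a minimal sketch in C that counts how many descriptors of a process point at a given file by reading the symlinks under `/proc/<pid>/fd`. The function name `count_fds_on_path` is hypothetical and only for illustration; it assumes a Linux `/proc` filesystem and enough privileges to read the target process's fd directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of HAProxy): count the descriptors of
 * process `pid` whose /proc/<pid>/fd symlink resolves exactly to `target`.
 * Returns the count, or -1 if the fd directory cannot be opened. */
int count_fds_on_path(pid_t pid, const char *target)
{
    char dirpath[64], linkpath[PATH_MAX], dest[PATH_MAX];
    struct dirent *de;
    DIR *dir;
    int count = 0;

    snprintf(dirpath, sizeof(dirpath), "/proc/%ld/fd", (long)pid);
    dir = opendir(dirpath);
    if (!dir)
        return -1;

    while ((de = readdir(dir)) != NULL) {
        ssize_t n;

        if (de->d_name[0] == '.')   /* skip "." and ".." */
            continue;
        snprintf(linkpath, sizeof(linkpath), "%s/%s", dirpath, de->d_name);
        n = readlink(linkpath, dest, sizeof(dest) - 1);
        if (n < 0)                  /* fd may have been closed meanwhile */
            continue;
        dest[n] = '\0';             /* readlink does not NUL-terminate */
        if (strcmp(dest, target) == 0)
            count++;
    }
    closedir(dir);
    return count;
}
```

Called with the master PID and the server-state path exactly as it appears in the listings above, this would be expected to return 1 before the reload and 2 after it.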
No. I know reloads trigger this, but I am not sure about the internals.
No
da29fe2360e61a5ed4acd283765b20addd5a3ea8
$ g bisect good
da29fe2360e61a5ed4acd283765b20addd5a3ea8 is the first bad commit
commit da29fe2360e61a5ed4acd283765b20addd5a3ea8
Author: Baptiste Assmann <[email protected]>
Date: Thu Jun 13 13:24:29 2019 +0200
MEDIUM: server: server-state global file stored in a tree
Server states can be recovered from either a "global" file (all backends)
or a "local" file (per backend).
The way the algorithm to parse the state file was first implemented was good
enough for a low number of backends and servers per backend.
Basically, for each backend the state file (global or local) is opened,
parsed entirely and for each line we check if it contains data related to
a server from the backend we're currently processing.
We must read the file entirely, just in case some lines for the current
backend are stored at the end of the file.
This does not scale at all!
This patch changes the behavior above for the "global" file only. Now,
the global file is read and parsed once and all lines it contains are
stored in a tree, for faster discovery.
This result in way much less fopen, fgets, and strcmp calls, which make
loading of very big state files very quick now.
include/types/server.h | 11 ++
src/server.c | 412 +++++++++++++++++++++++++++++++++----------------
2 files changed, 294 insertions(+), 129 deletions(-)
$ g bisect log
git bisect start
# bad: [5254321d1447bc72a22f0381a0225175d42e6704] BUILD: tcp: condition TCP keepalive settings to platforms providing them
git bisect bad 5254321d1447bc72a22f0381a0225175d42e6704
# bad: [32bf97fb6048e0fb7afe8c336e6a1594fbde9430] [RELEASE] Released version 2.2-dev3
git bisect bad 32bf97fb6048e0fb7afe8c336e6a1594fbde9430
# good: [9dc6b97429ce0f5be142fa9b920bf0ef0a714d73] [RELEASE] Released version 2.1-dev0
git bisect good 9dc6b97429ce0f5be142fa9b920bf0ef0a714d73
# bad: [34779c34fcb483b91339a1c4c8d74da5ad7ff530] CLEANUP: ssl: remove old TODO commentary
git bisect bad 34779c34fcb483b91339a1c4c8d74da5ad7ff530
# bad: [e40f274878eb70946a1792f5ef142ec0d57ac9c4] BUILD: trace: make the lockon_ptr const to silence a warning without threads
git bisect bad e40f274878eb70946a1792f5ef142ec0d57ac9c4
# bad: [2ab5c38359340c52abce3516e572b838a30b1754] BUG/MINOR: checks: do not exit tcp-checks from the middle of the loop
git bisect bad 2ab5c38359340c52abce3516e572b838a30b1754
# bad: [37243bc61f9c5cf88d1fe96a016e5f2f7e5e0c60] BUG/MEDIUM: mux-h1: Don't release h1 connection if there is still data to send
git bisect bad 37243bc61f9c5cf88d1fe96a016e5f2f7e5e0c60
# bad: [ad03288e6b28d816abb443cf8c6d984a72bb91a6] BUG/MINOR: mworker/cli: don't output a \n before the response
git bisect bad ad03288e6b28d816abb443cf8c6d984a72bb91a6
# bad: [da29fe2360e61a5ed4acd283765b20addd5a3ea8] MEDIUM: server: server-state global file stored in a tree
git bisect bad da29fe2360e61a5ed4acd283765b20addd5a3ea8
# good: [d4376302377e4f51f43a183c2c91d929b27e1ae3] MINOR: sample: Add sha2([<bits>]) converter
git bisect good d4376302377e4f51f43a183c2c91d929b27e1ae3
# first bad commit: [da29fe2360e61a5ed4acd283765b20addd5a3ea8] MEDIUM: server: server-state global file stored in a tree
I guess the patch proposed by @chipitsine to fix the issue #660 should fix this one too.
I planned to polish the commit message over the weekend :)
With master-worker mode, it is better to flag temporary FDs with FD_CLOEXEC to prevent a potential leak during reload.
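To illustrate the FD_CLOEXEC suggestion, here is a minimal sketch in C. It is not HAProxy's actual code: `open_cloexec` and `fd_is_cloexec` are hypothetical helpers, and the sketch assumes the leak path is a descriptor surviving an `execve()` when the master re-executes itself on reload. A descriptor flagged close-on-exec is closed by the kernel at exec time, so it cannot be inherited even if a code path forgets to close it.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: open `path` read-only and flag the underlying
 * descriptor close-on-exec. Returns NULL on any failure. */
FILE *open_cloexec(const char *path)
{
    FILE *f = fopen(path, "r");
    int fd, flags;

    if (!f)
        return NULL;

    fd = fileno(f);
    flags = fcntl(fd, F_GETFD);
    if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) {
        fclose(f);
        return NULL;
    }
    return f;
}

/* Hypothetical helper: 1 if `fd` is close-on-exec, 0 if not, -1 on error. */
int fd_is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

The flag is only a safety net. The temporary file should still be closed with `fclose()` once parsing is done, since close-on-exec does nothing for a process that never calls exec.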
@capflam is this fixed in the 2.2 stable release?
Yes, in 2.2.1. The original commits were dc6e8a9a7 and d74774bc.