Haproxy: Possible memleak through spoe_appctx pool

Created on 1 Feb 2019 · 5 Comments · Source: haproxy/haproxy

Output of haproxy -vv and uname -a

$ haproxy -vv
HA-Proxy version 1.8.17-1ppa1~xenial 2019/01/15
Copyright 2000-2019 Willy Tarreau <[email protected]>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-unused-label
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_SYSTEMD=1 USE_PCRE2=1 USE_PCRE2_JIT=1 USE_NS=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
Running on OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE2 version : 10.21 2016-01-12
PCRE2 library supports JIT : yes
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
    [SPOE] spoe
    [COMP] compression
    [TRACE] trace
$ uname -a
Linux openio-1 4.15.0-38-generic #41~16.04.1-Ubuntu SMP Wed Oct 10 20:16:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

What's the configuration?

$ cat /etc/haproxy/haproxy.cfg
global
  log 127.0.0.1:514 local5 info
  chroot /var/lib/haproxy

  user haproxy
  group haproxy
  daemon

  stats socket /run/haproxy/stats.sock mode 0660 level admin
  stats timeout 30s

  # SSL
  ca-base /etc/ssl/certs
  crt-base /etc/ssl/private
  # Default ciphers to use on SSL-enabled listening sockets.
  # For more information, see ciphers(1SSL). This list is from:
  #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
  ssl-default-bind-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
  ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
  ssl-default-server-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
  ssl-default-server-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets

defaults
  log global
  unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
  unique-id-header X-Unique-ID
  mode http
  option  httplog clf
  option  dontlognull
  option  log-separate-errors
  option  log-health-checks
  option  http-server-close
  timeout connect 5s
  timeout client  60s
  timeout server  60s
  timeout http-request 15s
  timeout queue 1m
  timeout http-keep-alive 10s
  timeout check 10s
  timeout tunnel 1h
  errorfile 400 /etc/haproxy/errors/400.http
  errorfile 403 /etc/haproxy/errors/403.http
  errorfile 408 /etc/haproxy/errors/408.http
  errorfile 500 /etc/haproxy/errors/500.http
  errorfile 502 /etc/haproxy/errors/502.http
  errorfile 503 /etc/haproxy/errors/503.http
  errorfile 504 /etc/haproxy/errors/504.http

###########################
# Listen Definition
###########################
listen stats
  stats enable
  stats uri /stats
  stats realm Haproxy\ Statistics
  stats auth [REDACTED]
  stats admin if TRUE
  bind [IP]:[PORT]
  mode http
  timeout client 5000
  timeout connect 4000
  timeout server 30000
  balance
# end listen stats

###########################
# Frontend Definition
###########################

frontend frontend1
  log-format "%ci:%cp [%t] %ft %b/t=%Tt q=%Tq w=%Tw c=%Tc r=%Tr/%ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r %ID"
  default_backend backend1
  bind [IP]:[PORT]
  reqadd X-Forwarded-Proto:\ http
  mode http
  option forwardfor
# end frontend frontend1

frontend frontend2
  log-format "%ci:%cp [%t] %ft %b/t=%Tt q=%Tq w=%Tw c=%Tc r=%Tr/%ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r %ID"
  default_backend backend2
  bind [IP]:[PORT]
  reqadd X-Forwarded-Proto:\ http
  mode http
  option forwardfor
# end frontend frontend2

frontend frontend3
  log-format "%ci:%cp [%t] %ft %b/t=%Tt q=%Tq w=%Tw c=%Tc r=%Tr/%ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r %ID"
  default_backend backend3
  bind [IP]:[PORT]
  reqadd X-Forwarded-Proto:\ http
  mode http
  option forwardfor
# end frontend frontend3

frontend frontend4
  bind [IP]:[PORT] name service
  default_backend backend4
  log-format "%ci:%cp [%t] %ft %b/t=%Tt w=%Tw c=%Tc/%B %ts %ac/%fc/%bc/%sc/%rc %sq/%bq"
  mode tcp
# end frontend frontend4

###########################
# Backend Definition
###########################

backend backend1
  balance roundrobin
  mode http
  server service1 [IP]:[PORT] check inter 5s
  [...]
# end backend backend1

backend backend2
  balance roundrobin
  mode http
  server service2 [IP]:[PORT] check inter 5s
  [...]
# end backend backend2

backend backend3
  balance roundrobin
  mode http
  server service3 [IP]:[PORT] check inter 5s
# end backend backend3

backend backend4
  server service4 [IP]:[PORT] check inter 5s
  server service42 [IP]:[PORT] check inter 5s backup
  mode tcp
  option tcp-check
# end backend backend4

Steps to reproduce the behavior

  1. Ask for pool information at intervals:
$ echo "show pools" | nc -U /var/run/haproxy/stats.sock
Dumping pools usage. Use SIGQUIT to flush them.
  - Pool cache_st (16 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool pipe (32 bytes) : 5 allocated (160 bytes), 5 used, 0 failures, 2 users [SHARED]
  - Pool email_alert (48 bytes) : 17 allocated (816 bytes), 1 used, 0 failures, 4 users [SHARED]
  - Pool tcpcheck_ru (64 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 5 users [SHARED]
  - Pool spoe_appctx (128 bytes) : 2043604 allocated (261581312 bytes), 2043604 used, 0 failures, 3 users [SHARED]
  - Pool spoe_ctx (144 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool h2s (160 bytes) : 51 allocated (8160 bytes), 33 used, 0 failures, 3 users [SHARED]
  - Pool h2c (240 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool http_txn (288 bytes) : 5 allocated (1440 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool connection (384 bytes) : 17 allocated (6528 bytes), 2 used, 0 failures, 1 users [SHARED]
  - Pool hdr_idx (416 bytes) : 5 allocated (2080 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool dns_resolut (480 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool dns_answer_ (576 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool stream (848 bytes) : 10 allocated (8480 bytes), 1 used, 0 failures, 1 users [SHARED]
  - Pool requri (1024 bytes) : 2 allocated (2048 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool trash (16400 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users
  - Pool buffer (16408 bytes) : 8 allocated (131264 bytes), 2 used, 0 failures, 1 users [SHARED]
Total: 17 pools, 261742288 bytes allocated, 261621232 used.
$ echo "show pools" | nc -U /var/run/haproxy/stats.sock
Dumping pools usage. Use SIGQUIT to flush them.
  - Pool cache_st (16 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool pipe (32 bytes) : 5 allocated (160 bytes), 5 used, 0 failures, 2 users [SHARED]
  - Pool email_alert (48 bytes) : 17 allocated (816 bytes), 1 used, 0 failures, 4 users [SHARED]
  - Pool tcpcheck_ru (64 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 5 users [SHARED]
  - Pool spoe_appctx (128 bytes) : 2048491 allocated (262206848 bytes), 2048491 used, 0 failures, 3 users [SHARED]
  - Pool spoe_ctx (144 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool h2s (160 bytes) : 51 allocated (8160 bytes), 33 used, 0 failures, 3 users [SHARED]
  - Pool h2c (240 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool http_txn (288 bytes) : 5 allocated (1440 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool connection (384 bytes) : 17 allocated (6528 bytes), 2 used, 0 failures, 1 users [SHARED]
  - Pool hdr_idx (416 bytes) : 5 allocated (2080 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool dns_resolut (480 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool dns_answer_ (576 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool stream (848 bytes) : 10 allocated (8480 bytes), 1 used, 0 failures, 1 users [SHARED]
  - Pool requri (1024 bytes) : 2 allocated (2048 bytes), 0 used, 0 failures, 1 users [SHARED]
  - Pool trash (16400 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users
  - Pool buffer (16408 bytes) : 8 allocated (131264 bytes), 2 used, 0 failures, 1 users [SHARED]
Total: 17 pools, 262367824 bytes allocated, 262246768 used.
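
The growth between the two dumps above can be quantified with a small helper. This is a minimal sketch, not part of the original report: the function name pool_allocated is hypothetical, and the two sample lines are copied from the dumps above so the snippet runs without a live stats socket.

```shell
#!/bin/sh
# pool_allocated POOL: read "show pools" output on stdin and print the
# "allocated" object count for the named pool (hypothetical helper).
pool_allocated() {
    awk -v pool="$1" '$2 == "Pool" && $3 == pool { print $7 }'
}

# Sample lines copied from the two dumps in this report.
first='  - Pool spoe_appctx (128 bytes) : 2043604 allocated (261581312 bytes), 2043604 used, 0 failures, 3 users [SHARED]'
second='  - Pool spoe_appctx (128 bytes) : 2048491 allocated (262206848 bytes), 2048491 used, 0 failures, 3 users [SHARED]'

before=$(printf '%s\n' "$first" | pool_allocated spoe_appctx)
after=$(printf '%s\n' "$second" | pool_allocated spoe_appctx)
delta=$((after - before))

# 128 is the per-object size shown in the dump for this pool.
echo "spoe_appctx grew by $delta objects ($((delta * 128)) bytes)"
# prints: spoe_appctx grew by 4887 objects (625536 bytes)
```

Against a running instance, the same helper could be fed from the command used above: echo "show pools" | nc -U /var/run/haproxy/stats.sock | pool_allocated spoe_appctx. Note that the dump marks this pool as [SHARED] with 3 users, so the name spoe_appctx does not by itself identify which user is allocating from it.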

Actual behavior

Pool spoe_appctx has grown in size, although spoe isn't even active. RAM usage keeps rising until OOM occurs.

Below is the graph showing memory consumption of haproxy process.

[Image: haproxy_memory]

Expected behavior

RAM usage doesn't grow, and neither does the spoe_appctx pool.

Do you have any idea what may have caused this?

No

Do you have an idea how to solve the issue?

No :(

Labels: medium, fixed, http, bug

All 5 comments

I can confirm the issue. It appears to be related to the unique ID. I made this change, which aborts as soon as the pool is refilled:

diff --git i/src/memory.c w/src/memory.c
index 4290c4b3..1888020e 100644
--- i/src/memory.c
+++ w/src/memory.c
@@ -107,6 +107,9 @@ struct pool_head *create_pool(char *name, unsigned int size, unsigned int flags)
  */
 void *__pool_refill_alloc(struct pool_head *pool, unsigned int avail)
 {
+       if (memcmp(pool->name, "spoe_appctx", 6) == 0) {
+               abort();
+       }
        void *ptr = NULL;
        int failed = 0;

and received the following abort:

==31060==
==31060== Process terminating with default action of signal 6 (SIGABRT)
==31060==    at 0x5B8C428: raise (raise.c:54)
==31060==    by 0x5B8E029: abort (abort.c:89)
==31060==    by 0x4E71E5: __pool_refill_alloc (memory.c:111)
==31060==    by 0x42C6D9: pool_alloc_dirty (memory.h:154)
==31060==    by 0x42C6D9: pool_alloc (memory.h:229)
==31060==    by 0x42C6D9: http_process_request (proto_http.c:3770)
==31060==    by 0x45F736: process_stream (stream.c:1909)
==31060==    by 0x4DFAD2: process_runnable_tasks (task.c:229)
==31060==    by 0x48EC9A: run_poll_loop (haproxy.c:2415)
==31060==    by 0x48EC9A: run_thread_poll_loop (haproxy.c:2481)
==31060==    by 0x409929: main (haproxy.c:3084)
==31060==

Hi,

I confirm it is likely related to the unique-id pool. I can't reproduce it yet, though; I'm still investigating.

Just merged Olivier's fix.

If you can, please apply the attached patch, adapted for 1.8.

From 2981fd47770ca1b64d14e7562abf2e7eab2d2108 Mon Sep 17 00:00:00 2001
From: Olivier Houchard <[email protected]>
Date: Fri, 1 Feb 2019 18:10:46 +0100
Subject: [PATCH] BUG/MEDIUM: stream: Don't forget to free s->unique_id in
 stream_free().

In stream_free(), free s->unique_id. We may still have one, because it's
allocated in log.c::strm_log() no matter what, even if it's a TCP connection
and thus it won't get free'd by http_end_txn().
Failure to do so leads to a memory leak.

This should probably be backported to all maintained branches.
---
 src/stream.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/stream.c b/src/stream.c
index 507cd2f4..b40a1bca 100644
--- a/src/stream.c
+++ b/src/stream.c
@@ -339,6 +339,8 @@ static void stream_free(struct stream *s)
        offer_buffers(NULL, tasks_run_queue + applets_active_queue);
    }

+   pool_free(pool_head_uniqueid, s->unique_id);
+
    hlua_ctx_destroy(s->hlua);
    s->hlua = NULL;
    if (s->txn)
-- 
2.20.1

If not, a workaround would be to move unique-id-format and unique-id-header from defaults to each HTTP frontend.

Olivier

Moving the unique-id options to each frontend does seem to work around the issue. Thank you!
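
For readers applying the same workaround, here is a sketch of how it might look against the configuration from this report. It is an illustration only, not a tested configuration: the two unique-id directives are copied verbatim from the original defaults section, and the comment lines stand in for settings that stay unchanged.

```
defaults
  log global
  # unique-id-format and unique-id-header removed from this section
  mode http
  # (remaining defaults unchanged)

frontend frontend1
  unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
  unique-id-header X-Unique-ID
  # (remaining frontend1 settings unchanged)

# Repeat the same two lines in frontend2 and frontend3.

frontend frontend4
  mode tcp
  # no unique-id directives here
  # (remaining frontend4 settings unchanged)
```

The likely reason this helps, going by the patch description, is that the TCP frontend no longer inherits a unique-id format from defaults, so no unique ID is allocated for streams that never reach http_end_txn().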
