Envoy: Customizing backlog size

Created on 27 Dec 2019 · 8 Comments · Source: envoyproxy/envoy

There seems to be no way to customize the backlog size of a listening socket; it is fixed at 128.
https://github.com/envoyproxy/envoy/blob/7ff7cb4c6a1dd62e43ad9aeaee98bb103971ba6a/source/common/network/listener_impl.cc#L52
https://github.com/libevent/libevent/blob/master/listener.c#L180

It would be nice if you could provide an option to change it.

area/listener, help wanted

All 8 comments

Hi @mattklein123, we've been investigating some issues that one of our high-throughput services was having, and found the cause to be this low fixed backlog size. A particular problem is that when the queue is full and new connections are rejected, Envoy does not seem to be aware of it and emits no error metrics. So we were flying blind and had to rely on OS stats to find the cause.
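For anyone needing the same OS-level visibility: on Linux, accept-queue overflows show up as the `ListenOverflows` and `ListenDrops` counters in the `TcpExt` section of `/proc/net/netstat`. The following is a minimal sketch of reading them, not something Envoy provides; the function names are hypothetical and the path is Linux-specific.

```python
from pathlib import Path


def parse_tcpext(netstat_text: str) -> dict:
    """Parse the TcpExt header/value line pair found in /proc/net/netstat.

    The file stores each section as two lines: one with counter names and
    one with the matching values, both prefixed with "TcpExt:".
    """
    lines = [line for line in netstat_text.splitlines() if line.startswith("TcpExt:")]
    if len(lines) < 2:
        return {}
    names = lines[0].split()[1:]   # drop the "TcpExt:" prefix
    values = lines[1].split()[1:]
    return {name: int(value) for name, value in zip(names, values)}


def listen_queue_counters(path: str = "/proc/net/netstat") -> dict:
    """Return the two counters that grow when the accept queue overflows."""
    counters = parse_tcpext(Path(path).read_text())
    # Missing counters are reported as 0 (an assumption made for simplicity).
    return {key: counters.get(key, 0) for key in ("ListenOverflows", "ListenDrops")}
```

Calling `listen_queue_counters()` periodically and alerting when either value increases gives a rough signal that a listener's backlog is too small. The counters are system-wide, so they do not identify which socket overflowed.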

Unfortunately this could be a blocker for continuing our rollout of Envoy to our higher-tier services. My team is not very familiar with the Envoy codebase yet, but we could try to fix this and raise a PR, so that it hopefully makes it in time for the 1.14.0 release.

It seems the only fix needed is for a user setting to make its way to the backlog argument of evconnlistener_new, keeping -1 as the default for backwards compatibility.
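As a rough illustration of why -1 currently yields 128, here is a sketch of how evconnlistener_new is understood to treat its backlog argument, based on the libevent listener.c linked at the top of the issue. It models the behavior rather than calling libevent, the function name is hypothetical, and details may differ between libevent versions.

```python
from typing import Optional

# Value libevent is understood to pass to listen() when backlog is negative.
LIBEVENT_FALLBACK_BACKLOG = 128


def backlog_passed_to_listen(backlog: int) -> Optional[int]:
    """Model of the backlog handling in libevent's evconnlistener_new.

    Returns the value handed to listen(), or None when listen() is skipped.
    """
    if backlog > 0:
        return backlog  # an explicit positive value is used as-is
    if backlog < 0:
        return LIBEVENT_FALLBACK_BACKLOG  # "pick a reasonable default"
    return None  # backlog == 0: the socket is assumed to be listening already
```

Under this model, a user setting of 1024 would reach listen() unchanged, while the current hard-coded -1 always ends up as 128.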

Also, to avoid back and forth with the PR:

  • Would you expect this to be a command-line argument (applied to all listeners), or a listener-specific setting that we should add to listener.proto? Do you think tcp_backlog_size would be a good name for it?
  • Secondly, I would appreciate it if someone could point me to a PR implementing a similar feature (e.g. introducing a user-controlled connection setting), so it's easier for us to understand how to implement and test the fix.

cc @paulnivin, who has a partially done patch for this that maybe you can finish. Paul, can you post what you have?

https://github.com/paulnivin/envoy/tree/listen_somaxconn is the branch I had been working on, but I haven't had cycles of late to finish it up and add tests. Diff @ https://github.com/paulnivin/envoy/compare/master...paulnivin:listen_somaxconn?expand=1

My initial approach was to just have Envoy use the OS-tunable backlog (easier to implement, and it solves the high-throughput case); a later version, if needed, could plumb through an application-specific setting/override for the backlog.
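To make the "OS-tunable backlog" concrete: on Linux, listen(2) caps the requested backlog at the `net.core.somaxconn` sysctl and compares the value as unsigned, so passing -1 directly to listen() effectively selects the kernel limit. This describes listen() itself, not libevent's wrapper, which per this issue ends up at 128. A minimal sketch under those assumptions, with hypothetical helper names:

```python
from pathlib import Path

SOMAXCONN_PATH = "/proc/sys/net/core/somaxconn"  # Linux-only sysctl file


def read_somaxconn(path: str = SOMAXCONN_PATH, fallback: int = 128) -> int:
    """Return net.core.somaxconn, or `fallback` (an assumed value) if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return fallback


def effective_backlog(requested: int, somaxconn: int) -> int:
    """Model the Linux cap applied to the backlog passed to listen()."""
    unsigned = requested & 0xFFFFFFFF  # reinterpret as 32-bit unsigned; -1 becomes 4294967295
    return min(unsigned, somaxconn)
```

With this model, `effective_backlog(-1, read_somaxconn())` gives the kernel limit, so operators can raise the queue size through the sysctl alone without any application-level setting.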

In a pinch, for testing or urgent situations, it's possible to use LD_PRELOAD to override the backlog without any source changes or recompiling (e.g. https://access.redhat.com/solutions/3314151).

@paulnivin wondering if you are planning to roll this out?
We are currently using LD_PRELOAD and are trying to get away from it.

@surki sorry this has been on my list to finish for a while. I will try to finish it soon.

cc @florincoras, who I think volunteered (been volunteered?) to fix this. @florincoras I would add a config option on the Listener proto object, which I think should be pretty easy to plumb through to where you need it.

If unset, I would probably do what @paulnivin did here: https://github.com/paulnivin/envoy/compare/master...paulnivin:listen_somaxconn?expand=1 which is to make it -1 on Linux to use the kernel default, otherwise allow it to be explicitly configurable.
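The defaulting rule described here could be sketched as follows. This is an illustration only, not Envoy's implementation: the function name is hypothetical, and the non-Linux fallback of 128 is an assumption that the comment above does not state.

```python
import sys
from typing import Optional

ASSUMED_NON_LINUX_DEFAULT = 128  # assumption; not specified in the thread


def resolve_listen_backlog(configured: Optional[int], platform: str = sys.platform) -> int:
    """Pick the backlog to pass to listen() for a listener.

    `configured` is the optional per-listener setting; None means unset.
    """
    if configured is not None:
        return configured  # an explicit setting always wins
    if platform.startswith("linux"):
        return -1  # lets the kernel apply its own limit
    return ASSUMED_NON_LINUX_DEFAULT
```

For example, `resolve_listen_backlog(None, "linux")` returns -1, while `resolve_listen_backlog(2048)` returns 2048 on any platform.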

@mattklein123 :-) Will do once we're done with #12547!
