I have been using Caddy as a proxy engine for a while. Recently I found that the free memory of the device is slowly decreasing.
The scenario is as follows: on the device we launch a process (named agent) that provides an interface for site configuration (add, delete, or change) and then notifies Caddy to reload the new configuration by sending the USR1 signal.
After Caddy had been running for one week, the top memory-consuming functions were as follows:
      flat  flat%   sum%        cum   cum%
 1284.11MB 12.44% 12.44%  1284.11MB 12.44%  github.com/mholt/caddy/caddyfile.(*parser).directive
  856.70MB  8.30% 20.74%  1195.28MB 11.58%  github.com/mholt/caddy/vendor/golang.org/x/net/http2.configureTransport
  706.98MB  6.85% 27.59%   706.98MB  6.85%  bytes.makeSlice
  577.11MB  5.59% 33.18%  1772.39MB 17.17%  github.com/mholt/caddy/caddyhttp/proxy.NewSingleHostReverseProxy
  573.15MB  5.55% 38.73%  3151.70MB 30.53%  github.com/mholt/caddy/caddyhttp/proxy.NewStaticUpstreams
  430.62MB  4.17% 52.63%   440.62MB  4.27%  net/textproto.MIMEHeader.Add
  409.11MB  3.96% 56.59%   409.11MB  3.96%  github.com/mholt/caddy/caddytls.NewConfig
  365.13MB  3.54% 60.13%   886.57MB  8.59%  github.com/mholt/caddy/caddyhttp/httpserver.(*httpContext).InspectServerBlocks
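For reference, a listing like the one above comes from a Go heap profile; Caddy 1 can expose the profiling endpoints with its pprof directive. A minimal standalone sketch of the same idea (the address localhost:6060 is just an assumption for illustration):

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers
)

func main() {
	// The heap profile can then be read with:
	//   go tool pprof -top http://localhost:6060/debug/pprof/heap
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}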
The function "proxy.NewStaticUpstreams" has taken up so much memory. I doubt if there is
memory leak. After reading the code of the function “NewStaticUpstreams”, I still can't find the reason.
As a comparison,I restart the caddy process and find the function "proxy.NewStaticUpstreams" only consume about 3M.
Caddy version (caddy -version): 0.11.0
We periodically add new sites and then have Caddy reload the new site configuration.
Site example:
http://www.testexample.com {
    proxy / 10.67.8.123 {
        websocket
        transparent
        keepalive 100
        timeout 3m
    }
}
Total number of sites: about 4000
How Caddy is run: caddy -conf ./Caddyfile
What I expected: after running for a long time, the free memory does not decrease.
What I saw instead: the free memory of the device decreases.
How to reproduce (a sketch of automating steps 2 and 3 follows below):
(1) Run Caddy as a proxy engine with live on-line traffic.
(2) Add new sites or modify existing site configurations.
(3) Have Caddy reload the new site configuration.
(4) Use as many sites as possible (mine has about 4000).
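A minimal sketch of how steps (2) and (3) could be automated (the site names, backend address, file path, and this helper itself are placeholders for illustration, not the actual script used):

package main

import (
	"fmt"
	"os"
	"strconv"
	"syscall"
	"time"
)

func main() {
	// caddy's PID is passed as the first argument
	pid, err := strconv.Atoi(os.Args[1])
	if err != nil {
		panic(err)
	}
	for i := 0; ; i++ {
		// append one more site block to the Caddyfile
		f, err := os.OpenFile("./Caddyfile", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
		if err != nil {
			panic(err)
		}
		fmt.Fprintf(f, "\nhttp://site%d.testexample.com {\n\tproxy / 10.67.8.123\n}\n", i)
		f.Close()
		// ask caddy to gracefully reload the new configuration
		syscall.Kill(pid, syscall.SIGUSR1)
		time.Sleep(time.Minute)
	}
}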
First, please upgrade to the latest version (0.11.3 as of now) and see if that shows any improvements.
Does the memory usage grow with each USR1 or does it grow without USR1?
> First, please upgrade to the latest version (0.11.3 as of now) and see if that shows any improvements.
> Does the memory usage grow with each USR1 or does it grow without USR1?
Without USR1, memory is normal (it does not increase).
Good to know; we'll have to look into why the resources aren't getting freed up.
Related to #2358?
@magikstm Yes, I think so, at least in the sense that the goroutine leaked. I merged the fix for that just now.
However, I think the memory leak is still a problem. Interestingly, it grows with the size of the Caddyfile. It's the parsing of directives that really does it; I'm not sure why yet. If someone wants to dig in, do some profiling, and find out why executeDirectives is using so much memory after reloads, that'd be helpful!
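(One way to approach that profiling, sketched here purely as an illustration rather than a prescribed procedure: save Caddy's heap profile before and after a USR1 reload and diff the two snapshots with `go tool pprof -base before.prof after.prof`. The helper below and the localhost:6060 pprof address are assumptions.)

package main

import (
	"io"
	"net/http"
	"os"
)

// saveHeapProfile downloads the current heap profile from caddy's pprof
// endpoint and writes it to the given file.
func saveHeapProfile(path string) error {
	resp, err := http.Get("http://localhost:6060/debug/pprof/heap")
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	// usage: saveheap before.prof   (then reload, then: saveheap after.prof)
	if err := saveHeapProfile(os.Args[1]); err != nil {
		panic(err)
	}
}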
> Good to know; we'll have to look into why the resources aren't getting freed up.
Thanks.
> Related to #2358?
However, issue #2466 is a new problem. If there are any clues, that would be helpful. Thanks.
I can confirm that there is some increase of memory usage after USR1 reloads which scales with the size of the Caddyfile.
> I can confirm that there is some increase of memory usage after USR1 reloads which scales with the size of the Caddyfile.
Besides the size of the Caddyfile, I think live on-line traffic is another factor that causes memory to increase. Without traffic, I started a Caddy process on an isolated test device and sent the USR1 signal periodically with an automated script; after several days, memory usage was relatively stable.
Hi @mholt,
On my device I see that many keep-alive connections opened to the backend remain active.
Currently Caddy only limits the number of keep-alive connections; it does not time them out. In other words, if the backend server does not close an idle keep-alive connection from Caddy, that connection will stay active forever.
After the new configuration file is reloaded, the old keep-alive connections are not closed immediately, and these unclosed connections keep the old objects from being garbage-collected.
Could this explain why proxy.NewStaticUpstreams uses so much memory after reloads? I'm not certain about it.
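The general idea of the fix, as I understand it, is to close an upstream transport's idle keep-alive connections when the old configuration is torn down. A rough sketch of that idea (the function below is only an illustration, not the actual patch in the commit linked next):

package proxy

import "net/http"

// closeIdleConns is an illustrative sketch: closing a transport's idle
// keep-alive connections when its upstream is discarded after a reload lets
// the old configuration be garbage-collected.
func closeIdleConns(rt http.RoundTripper) {
	if t, ok := rt.(*http.Transport); ok {
		t.CloseIdleConnections()
	}
}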
Below is my committed code. Thanks for your review!
https://github.com/chengsurf/caddy/commit/8794949bdbcbf3252561332621f1f3f0c8e20faa
Thanks for the investigation, @chengsurf, and for the PR! I won't have the time to investigate deeply and evaluate it in my foreseeable near-future, though, so I encourage you or others to dive deeper and do some measurements and see if that is the problem or not.
I'm guessing if it is part of the problem, it is not the whole problem: I can still see memory usage increase without the proxy directive and without traffic to the server, just by flooding it with USR1 signals.
Hi @mholt,
I have a question: when Caddy reloads the new Caddyfile, it first gracefully shuts down the server without interrupting any active connections.
However, my self-test shows that it cannot stop active websocket connections. I wonder whether websocket connections should be closed after a USR1 reload.
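(For context, and as my own illustration rather than something stated in this thread: Go's graceful shutdown deliberately does not close hijacked connections such as websockets, so a server that wants them to end on reload has to track and close them itself, for example via a shutdown hook.)

package main

import "net/http"

func main() {
	srv := &http.Server{Addr: ":8080"}
	// Shutdown does not close hijacked connections such as websockets; the
	// server has to register its own hook and close any connections it has
	// tracked. This sketch only shows where such a hook would go.
	srv.RegisterOnShutdown(func() {
		// close tracked websocket/hijacked connections here
	})
	_ = srv.ListenAndServe()
}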
Caddy 2's config reload mechanism does not have any known memory leaks and is much faster & more efficient. Since Caddy 2 is around the corner (already in beta - try it out!), I don't anticipate that I'll invest much energy in fixing this in Caddy 1, as it's really difficult to pinpoint. Anyway, this is fixed in Caddy 2.