We started using threads to manage memory more efficiently. Our setup changed from 5 workers with 1 thread each to 1 worker with 5 threads. The problem is that with Gunicorn (v19.9.0) our memory usage goes up all the time and Gunicorn is not releasing the memory that has piled up from incoming requests. We hit the limit in our pods and the worker starts again. We are currently testing max_requests with 2 workers and 4 threads. We also upgraded to the latest version, but the max_requests argument didn't work. Is this how the worker is supposed to behave, not releasing the memory until it restarts?
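For reference, the kind of configuration involved looks roughly like this, a minimal sketch using the values described above (the file layout is just an illustration):

```python
# gunicorn.conf.py -- sketch of the setup described above

# one worker process running several threads (the gthread worker)
workers = 1
threads = 5
worker_class = "gthread"

# recycle each worker after it has handled this many requests,
# as a stop-gap against unbounded memory growth
max_requests = 1000
```

It would be started with something like `gunicorn -c gunicorn.conf.py myapp:app` (the module name here is hypothetical).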
Gunicorn should not leak memory. Does it only happen with the threaded worker? Can you test your application outside your production environment with the synchronous worker and the threaded worker and report back about memory consumption?
Sometimes, Python memory usage can be misleading. Python will recycle objects without freeing them back to the OS if it thinks it will need them again. However, memory usage should stabilize at some level unless there is a real leak.
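A quick stdlib-only way to see this effect (Linux-only, since it reads /proc; numbers will vary by allocator and workload):

```python
import gc

def rss_kb():
    """Read the process resident set size from /proc (Linux only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return -1

print("baseline RSS (kB):", rss_kb())

# allocate a few hundred MB worth of Python objects
data = [b"x" * 1024 for _ in range(300_000)]
print("after allocation (kB):", rss_kb())

# drop the references and force a collection; the interpreter and the
# underlying allocator may keep the freed memory around for reuse, so
# RSS often does not fall back to the baseline right away
del data
gc.collect()
print("after del + gc.collect() (kB):", rss_kb())
```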
Yes, it happened when we switched to the threaded worker. The memory issue only happens when we get a lot of requests to our pods. We are currently running with max_requests set to 30,000, with 2 workers and 4 threads. The memory limit is 4 GB, so we see pods start, climb to about 3 GB, and then restart because we are forcing them to restart. With max_requests_jitter we are at least okay in that they restart at different times rather than all at once, but the current behaviour is not healthy. Memory always increases with incoming requests, and unless there is a break between them it will keep going up. The issue, if I'm not wrong, is that the requests are somehow stored in memory and not released until there is a timeout. I will do local testing along with other frameworks, or perhaps with other types of workers (e.g. uvicorn). I will let you know about the results, but this is the current status with gthread. Thanks for getting back to me, by the way!
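Concretely, the workaround described above amounts to settings along these lines (a sketch; the values come from the comment, the jitter value is an assumption):

```python
# gunicorn.conf.py -- sketch of the current mitigation

workers = 2
threads = 4
worker_class = "gthread"

# force a worker restart after roughly 30,000 requests so memory is reclaimed
max_requests = 30000
# add randomness so the two workers do not restart at the same moment
max_requests_jitter = 3000
```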
There are quite a few memory profiling tools mentioned over on SO: https://stackoverflow.com/questions/110259/which-python-memory-profiler-is-recommended
I'll be very curious if switching workers helps, because then we should investigate the threaded worker. Hopefully, it is something simple you can find in your own code or in a library you use.
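If it helps, the stdlib tracemalloc module is a low-overhead starting point (not necessarily among the tools in that list); a minimal sketch of listing the top allocation sites, not specific to this issue:

```python
import tracemalloc

tracemalloc.start()

# ... exercise the code path that seems to grow, e.g. handle some requests ...

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    # each entry shows a source line and how much memory it currently holds
    print(stat)
```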
Which memory part is being used? Is there any possibility you can share an example of the app?
We have tried changing the threaded worker to the uvicorn worker, and now it takes around 2 hours to hit our memory limit, whereas before it was 30 minutes at peak times. This may also have some effect on our scaling strategy, but I would assume that since we are using a sync worker it should not be an issue; we should only see latency increase. Am I wrong @tilgovi @benoitc? We are implementing a memory API endpoint to detect which Python objects are consuming more memory compared to pod startup. I will share the results here once I have them.
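For what it's worth, a rough sketch of what such a memory endpoint could look like, as a plain WSGI app that counts live objects by type and diffs against counts taken at startup (the names and the approach are hypothetical, not the actual implementation):

```python
import gc
from collections import Counter

def count_objects_by_type():
    """Count live, gc-tracked objects grouped by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())

# baseline taken when the worker starts
_startup_counts = count_objects_by_type()

def memory_debug_app(environ, start_response):
    """Hypothetical WSGI endpoint returning the biggest growth since startup."""
    current = count_objects_by_type()
    growth = current - _startup_counts  # keeps only types that grew
    lines = [f"{name}: +{delta}" for name, delta in growth.most_common(20)]
    body = ("\n".join(lines) + "\n").encode()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```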
What do you mean by "which memory part"?
After months of digging, we found that it was the logging that was causing memory leaks when threaded workers are enabled for Gunicorn. So we fixed that and everything is working fine. Thanks for the help, guys!
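The exact logging problem isn't spelled out in the thread, so purely as a hedged illustration: one classic pattern that leaks under threaded workers is attaching a new handler on every request, since logging.getLogger() returns the same cached logger object and the handlers (with their buffers and file descriptors) keep accumulating:

```python
import logging

def handle_request_leaky():
    # anti-pattern: getLogger returns the same cached logger object,
    # so every call piles another handler (and its resources) onto it
    logger = logging.getLogger("app")
    logger.addHandler(logging.FileHandler("app.log"))
    logger.info("handling request")

def configure_logging_once():
    # the usual fix: configure handlers once at startup, not per request
    logger = logging.getLogger("app")
    if not logger.handlers:
        logger.addHandler(logging.FileHandler("app.log"))
```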
For posterity, are you able to share some details about what part of logging was causing the problem and how you fixed it?
@fazilhero are you able to share some details about what part of logging was causing the problem and how you fixed it?
@fazilhero Please tell us how you fixed this?