In a Django project, I am using Gunicorn as the application server. For the last few days the application runs smoothly for a few hours and then hangs. Gunicorn's memory usage keeps growing, CPU utilization then goes above 95%, and the application hangs.
Here is my Gunicorn configuration while starting application.
gunicorn base.wsgi:application
--bind=0.0.0.0:8000
--pid=logs/project/gunicorn.log
--access-logfile=logs/project/access.log
--workers=7
--worker-class=gevent
--error-logfile=logs/project/error8.log
--timeout=4500
--preload
Please help me on this issue.
You may need to decrease the number of workers. Try the formula 2 * CPUs + 1; your server may be a single-CPU machine, in which case 3 is a safe setting.
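The 2 * CPUs + 1 rule can be written directly in a Gunicorn config file, since those files are plain Python. A minimal sketch, assuming a config file named gunicorn.conf.py (a hypothetical name) and a helper function, suggested_workers, introduced here only for illustration:

```python
# gunicorn.conf.py (hypothetical file name), loaded with:
#   gunicorn -c gunicorn.conf.py base.wsgi:application
import multiprocessing


def suggested_workers(cpu_count):
    """Return the (2 x CPUs) + 1 starting point for the worker count."""
    return 2 * cpu_count + 1


# Gunicorn reads the module-level name `workers`; other names are ignored.
workers = suggested_workers(multiprocessing.cpu_count())
```

On a 4-CPU server this gives 9 workers, not 4. It is only a starting point and still has to be checked against the memory each worker needs.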
Any logs to share? How can we reproduce this? By itself Gunicorn doesn't use much RAM or CPU. Also, why are you preloading the app?
@syaiful6 The server currently has 4 CPUs. According to you, I have to set --workers=4, right?
@benoitc At any given time 12-15 users are logged in to the system and active. Gunicorn's memory usage grows, CPU utilization then rises, and the system hangs.
Is there a way to reproduce the issue? Gunicorn by itself doesn't use much CPU; it's more likely that something in your app is triggering the issue.
bump.
Just set max-requests and have the worker processes be cleaned up after a while; that should cap whatever memory leak you have.
No answer from the issue's author, so closing the issue.
@hardiksanchawat you shouldn't use preload unless you really need it; it may be one of the causes. Also, as @diwu1989 said, you can use the max-requests setting to contain any memory leak.
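For reference, the max-requests advice corresponds to the max_requests setting in a Gunicorn config file, and max_requests_jitter adds a random offset so that workers do not all restart at the same moment. A minimal sketch; the numbers are illustrative assumptions, not values recommended in this thread:

```python
# gunicorn.conf.py (hypothetical file name)

# Restart a worker after it has handled this many requests, so memory held
# by a leaking or bloated worker is returned to the OS.
max_requests = 1000

# Add a random 0..50 to the limit per worker to stagger the restarts.
max_requests_jitter = 50
```

Recycling workers hides a leak rather than fixing it, so it is worth profiling the application as well.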
We have hosted our web app (Angular CLI: 1.7.3, Node: 9.5.0, Django 2.0.3) on an AWS free-tier Ubuntu (14.04.5 LTS) VM. We are using ELB (Elastic Load Balancers) with Nginx 1.4.6 and Gunicorn 19.7.1.
When we access our app from a browser, Gunicorn's memory usage suddenly increases sharply:
Normal:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16248 ubuntu 20 0 313.1m 94.6m 9.3m R 15.6 9.5 0:00.71 gunicorn
While using App:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16235 ubuntu 20 0 684.9m 315.8m 11.1m R 61.4 31.8 0:04.51 gunicorn
Below is our gunicorn configuration (/etc/init/gunicorn.conf):
description "Gunicorn application server handling myproject"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
setuid ubuntu
setgid www-data
chdir /home/ubuntu/project/
exec ./env/bin/gunicorn --max-requests 1 --workers 3 --bind unix:/home/ubuntu/project/django-ng.sock config.wsgi:application
We have already set max-requests as 1 and workers as 3. Could someone please tell what is going wrong?
@satendrapratap I'm not sure anything is wrong. You might have to profile your app, but it's also possible that this is Python process on a 1GB machine and you have three workers. It may be normal. It's hard to say.
Ok. Which profiler should we use to profile?
Same code is running well with "python3 manage.py runserver" and we don't see any increase in memory used over a period of time or at once. Most probably there are no memory leaks otherwise there would have been increased memory consumption in this case also.
How would profiling actually help? Are you saying that we need to optimize the code to make it consume less memory?
but it's also possible that this is Python process on a 1GB machine and you have three workers.
How much memory does a Python process take at runtime? In our case, gunicorn takes as much as 500-600 MB with just 1 max-request and if we increase it to 2 then accessing app sometimes gives out of memory error.
@satendrapratap profiling would help if there were a place you could reduce your own application's memory usage.
It would also help to know if Gunicorn is using more memory than it should, or if there is a leak with Gunicorn that does not happen with runserver.
I just did some quick tests of minimal applications and it seems to be as small as 20-30MiB for a worker, but for a real application it depends a lot on the application and the framework.
I am interested in any comparison between Gunicorn and runserver you can provide.
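One profiler that needs no extra packages is tracemalloc from the Python standard library. The sketch below is an assumption about how it could be applied here, not something tested against this application: it wraps a workload (build_rows is a made-up stand-in for a view's data processing) and reports which source lines allocated the most memory while it ran.

```python
import tracemalloc


def top_allocation_growth(workload, limit=5):
    """Run workload() and return the source lines whose allocations grew most."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        result = workload()  # keep a reference so the data is alive for the snapshot
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    del result
    # compare_to sorts by the absolute size difference, largest first.
    return after.compare_to(before, "lineno")[:limit]


def build_rows():
    """Hypothetical stand-in for the processing a view performs."""
    return [{"id": i, "name": "row %d" % i} for i in range(50000)]


if __name__ == "__main__":
    for stat in top_allocation_growth(build_rows):
        print(stat)
```

Running the same workload several times and comparing the reports helps separate a one-time plateau from growth that continues on every call.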
In our case, gunicorn takes as much as 500-600 MB with just 1 max-request and if we increase it to 2 then accessing app sometimes gives out of memory error.
Note that Python has its own memory allocator and doesn't give previously allocated memory blocks back to OS so increased memory usage doesn't always mean that there is a memory leak.
Do you see the same memory usage on all URLs or just a specific URL (/products/all for example)?
I'd also suggest deploying a simple Django app (same database and Gunicorn configuration; just one model and a couple of views) to see whether Gunicorn is the cause of the high memory usage.
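To compare URLs without watching top by hand, Gunicorn's post_request server hook can log each worker's peak resident memory after every request. A minimal sketch, assuming a Unix host (the resource module is Unix-only); peak_rss_mib is a helper name made up for this example:

```python
# gunicorn.conf.py (hypothetical file name)
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


def post_request(worker, req, environ, resp):
    """Gunicorn server hook: called after a worker finishes a request."""
    worker.log.info(
        "pid=%s %s %s peak_rss=%.1f MiB",
        worker.pid, req.method, req.path, peak_rss_mib(),
    )
```

The value is a high-water mark and never goes down, so the first request that raises it for a given worker points at the URL responsible for the jump.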
It would also help to know if Gunicorn is using more memory than it should, or if there is a leak with Gunicorn that does not happen with runserver.
We will definitely try profiling. Could you please suggest a profiler to use in such scenario?
I just did some quick tests of minimal applications and it seems to be as small as 20-30MiB for a worker,
We have 3 workers, and while the app isn't being used they consume around 96 MB:
ubuntu@:~$ sudo python3 ps_mem.py
Private + Shared = RAM used Program
136.0 KiB + 9.0 KiB = 145.0 KiB acpid
180.0 KiB + 24.0 KiB = 204.0 KiB atd
220.0 KiB + 38.0 KiB = 258.0 KiB upstart-udev-bridge
240.0 KiB + 34.0 KiB = 274.0 KiB cron
236.0 KiB + 39.0 KiB = 275.0 KiB upstart-socket-bridge
288.0 KiB + 39.5 KiB = 327.5 KiB upstart-file-bridge
604.0 KiB + 50.0 KiB = 654.0 KiB systemd-udevd
544.0 KiB + 147.0 KiB = 691.0 KiB systemd-logind
684.0 KiB + 58.0 KiB = 742.0 KiB dbus-daemon
940.0 KiB + 56.0 KiB = 996.0 KiB rsyslogd
1.0 MiB + 131.5 KiB = 1.2 MiB getty (7)
896.0 KiB + 400.5 KiB = 1.3 MiB sudo
1.4 MiB + 73.0 KiB = 1.5 MiB init
2.5 MiB + 39.5 KiB = 2.6 MiB dhclient
2.3 MiB + 1.5 MiB = 3.8 MiB nginx (5)
2.5 MiB + 1.8 MiB = 4.3 MiB sshd (3)
6.4 MiB + 176.5 KiB = 6.6 MiB bash
50.1 MiB + 83.0 KiB = 50.2 MiB mysqld
ubuntu@:~$
ps_mem.py: https://raw.githubusercontent.com/pixelb/ps_mem/master/ps_mem.py
Python has its own memory allocator and doesn't give previously allocated memory blocks back to OS so increased memory usage doesn't always mean that there is a memory leak.
Fine but is there any way in python we can ask unused memory to be released?
Do you see the same memory usage on all URLs or just a specific URL (/products/all for example)?
Looks like memory usage is similar (not much difference) for different URLs:
URL to get some small data:
19181 ubuntu 20 0 553.8m 281.4m 10.0m R 32.6 28.4 0:01.21 gunicorn
19191 ubuntu 20 0 97.8m 34.1m 2.9m S 7.7 3.4 0:00.23 gunicorn
URL to get little more data:
19196 ubuntu 20 0 97.8m 34.1m 2.9m S 7.3 3.4 0:00.22 gunicorn
19191 ubuntu 20 0 564.0m 296.1m 10.0m R 56.2 29.8 0:01.92 gunicorn
19198 ubuntu 20 0 97.8m 34.1m 2.9m S 7.7 3.4 0:00.23 gunicorn
URL makes server code do some processing, structure the data and then send back:
19208 ubuntu 20 0 97.8m 34.1m 2.9m S 7.7 3.4 0:00.23 gunicorn
19196 ubuntu 20 0 686.4m 319.3m 11.2m R 79.6 32.2 0:05.20 gunicorn
19206 ubuntu 20 0 313.1m 94.8m 9.4m R 16.6 9.6 0:00.74 gunicorn
I'd also suggest deploying a simple Django app (same database and Gunicorn configuration; just one model and a couple of views) to see whether Gunicorn is the cause of the high memory usage.
Will try this, but it's more or less clear from the previous point (comparing memory consumption for different URL hits) that it consumes similar memory even for a simple REST service hit.
How is memory consumption affected by increasing workers and max-requests? I am asking because max-requests = 1 is not going to be used in production, and we will have to tune it and the number of workers as well.
So how much memory is usually required for a small application deployed with Gunicorn and Nginx (20-30 REST services, of which only 4-5 do some processing on the server while the rest just take data from the database and send it back to the user, being hit by at least 10000 users simultaneously)?
Fine but is there any way in python we can ask unused memory to be released?
You'd need to write a custom memory allocator using the Python C API, but that's a non-trivial task and I don't think it's needed.
URL to get some small data:
19181 ubuntu 20 0 553.8m 281.4m 10.0m R 32.6 28.4 0:01.21 gunicorn
The table is a bit hard to read. I'm guessing the first column is PID, the second is owner of the process, but the rest is a bit hard to guess. Could you add column names too?
In any case, the increase of memory usage from ~90 MB to ~500 MB is too much. I think we would already be getting a lot of similar reports by now if we had introduced such a huge memory leak :) (Gunicorn 19.7.1 was released more than a year ago.)
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
300MiB resident might not be unexpected, but I don't know. It is Python and an ORM, but I'm not a heavy Django user and haven't profiled these things before, myself.
For 10,000 users, depending on the request rate, you may need more than a t2.micro. However, we'll do our best to help you troubleshoot. Having nginx in front for queuing requests to Gunicorn will certainly help.
How much memory does the same application take when requesting the URLs against runserver?
We are using t2.micro which has 1 GB of ram. Our setup is like below:
CLIENT           |            AWS SERVER
                 |
                 |                      angular app, static files
                 |                     /
Web App<-------->| <----->ELB<--->nginx
                 |                     \
                 |                      gunicorn <-----> Backend Python Code
                 |
server {
    listen 80 default_server;
    server_name myapp.com www.myapp.com;
    charset utf-8;
    # max upload size
    # client_max_body_size 75M;
    client_max_body_size 0;
    # send all non-media requests to the Django server.
    location ~ ^/(myapp|admin) {
        proxy_pass http://unix:/home/ubuntu/project/django-ng.sock; # for a file socket
        include proxy_params;
    }
    location /static {
        alias /home/ubuntu/project/static/;
    }
    location / {
        root /home/ubuntu/deploy/angularapp;
        index index.html index.htm;
        try_files $uri $uri/ /index.html;
    }
}
description "Gunicorn application server handling myproject"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
setuid ubuntu
setgid www-data
chdir /home/ubuntu/project/
exec ./env/bin/gunicorn --max-requests 1 --workers 3 --bind unix:/home/ubuntu/project/django-ng.sock config.wsgi:application
How much memory does the same application take when requesting the URLs against runserver?
Using "python3 manage.py runserver" on local machine:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
Start App:
1158 pratap 20 0 900.4m 260.8m 24.5m S 0.0 3.2 1:52.17 python3
1156 pratap 20 0 80.9m 30.8m 30.2m S 0.0 0.4 0:00.50 python3
1158 pratap 20 0 900.4m 260.8m 24.5m S 0.0 3.2 1:52.17 python3
1158 pratap 20 0 900.4m 263.7m 24.9m S 37.0 3.3 1:53.28 python3
First Access Web App to do some processing:
1158 pratap 20 0 913.4m 294.0m 25.0m R 106.6 3.6 1:56.48 python3
1158 pratap 20 0 935.9m 345.1m 25.0m S 109.6 4.3 1:59.77 python3
1158 pratap 20 0 938.9m 349.2m 25.0m S 26.6 4.3 2:00.57 python3
1158 pratap 20 0 938.9m 349.2m 25.0m S 14.3 4.3 2:01.00 python3
1158 pratap 20 0 938.9m 349.2m 25.0m S 10.0 4.3 2:04.00 python3
1158 pratap 20 0 938.9m 349.2m 25.0m S 8.0 4.3 2:04.24 python3
1156 pratap 20 0 80.9m 30.8m 30.2m S 0.0 0.4 0:00.50 python3
1158 pratap 20 0 938.9m 349.2m 25.0m S 10.3 4.3 2:04.55 python3
1156 pratap 20 0 80.9m 30.8m 30.2m S 0.0 0.4 0:00.50 python3
1158 pratap 20 0 938.9m 349.2m 25.0m S 11.0 4.3 2:04.88 python3
1158 pratap 20 0 938.9m 349.2m 25.0m S 7.7 4.3 2:05.11 python3
Second Access Web App to do some processing:
1158 pratap 20 0 940.4m 351.0m 25.0m R 8.3 4.3 2:14.01 python3
1158 pratap 20 0 940.4m 351.0m 25.0m S 6.0 4.3 2:14.19 python3
1158 pratap 20 0 940.4m 351.0m 25.0m S 10.0 4.3 2:14.49 python3
1158 pratap 20 0 940.4m 351.0m 25.0m S 9.7 4.3 2:14.78 python3
1156 pratap 20 0 80.9m 30.8m 30.2m S 0.0 0.4 0:00.50 python3
1158 pratap 20 0 940.4m 351.0m 25.0m S 5.7 4.3 2:14.95 python3
1156 pratap 20 0 80.9m 30.8m 30.2m S 0.0 0.4 0:00.50 python3
1158 pratap 20 0 940.4m 351.0m 25.0m S 9.0 4.3 2:15.22 python3
1158 pratap 20 0 940.4m 351.0m 25.0m S 9.7 4.3 2:15.51 python3
1158 pratap 20 0 940.4m 351.0m 25.0m S 9.3 4.3 2:15.79 python3
Third Access Web App to do some processing:
1158 pratap 20 0 942.4m 353.7m 25.0m S 47.3 4.4 2:17.21 python3
1158 pratap 20 0 942.4m 353.7m 25.0m S 9.3 4.4 2:17.49 python3
1158 pratap 20 0 942.4m 353.7m 25.0m S 9.7 4.4 2:17.78 python3
1158 pratap 20 0 942.4m 353.7m 25.0m R 8.3 4.4 2:18.03 python3
1158 pratap 20 0 942.4m 353.7m 25.0m S 7.3 4.4 2:18.25 python3
1158 pratap 20 0 942.4m 353.7m 25.0m R 11.0 4.4 2:19.93 python3
1158 pratap 20 0 942.4m 355.1m 25.0m S 26.3 4.4 2:20.72 python3
1158 pratap 20 0 943.7m 355.4m 25.0m S 62.3 4.4 2:22.59 python3
Comparing these measurements to the earlier comment with Gunicorn memory stats, it seems like there is no problem with Gunicorn here. Am I misreading?
@tilgovi I am currently working on a similar problem and wanted to add my findings in case they help.
I have an Ubuntu 16 VM with 2GB of RAM running a Python 3 Flask web app. As I am using some NLP libraries, the initial memory usage is high.
You can see the last 30 days of memory usage in the graph below. It started at 33% three months ago and continues to increase (yesterday I upgraded to the latest version of gunicorn and the other packages, which is why the graph goes down and up).
Workers set to 2.
I cannot say it is caused by gunicorn exactly, but as I deploy every few days, restart the service, and reboot the VM from time to time, I would expect everything to start from around 33% again (if it were caused by my code or any package). But it continues where it left off.
So I assume something is caching things and not releasing them on reboot/restart.
Is there any method to monitor what is causing the memory increase under gunicorn? I can only see that 60% of memory is consumed by gunicorn.
@tilgovi we went through the details, and it looks like the gunicorn and runserver memory results can't be compared, because we configured gunicorn with max-requests 1, which restarts the worker after every request, so we can't really tell whether memory increases across subsequent REST service calls.
So we set max-requests to 100 and workers to 3 and ran our experiment again. When idle, gunicorn memory usage is fine:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25678 ubuntu 20 0 644632 34868 10824 S 0.0 26.6 0:02.94 gunicorn
25680 ubuntu 20 0 100136 34876 2992 S 0.0 3.4 0:00.23 gunicorn
25677 ubuntu 20 0 100132 34868 2992 S 0.0 3.4 0:00.24 gunicorn
25673 ubuntu 20 0 64836 18368 3948 S 0.0 1.8 0:00.17 gunicorn
But when we use the web app, the worker consumes more memory (maybe for Python data structures etc.), and after serving the request it does not return to the normal memory consumption of 34868 (maybe because Python has its own memory allocator and doesn't give previously allocated memory blocks back to the OS, but would reuse this memory for further REST service requests):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25678 ubuntu 20 0 644632 270432 10824 S 0.0 26.6 0:02.94 gunicorn
25680 ubuntu 20 0 100136 34876 2992 S 0.0 3.4 0:00.23 gunicorn
25677 ubuntu 20 0 100132 34868 2992 S 0.0 3.4 0:00.24 gunicorn
25673 ubuntu 20 0 64836 18368 3948 S 0.0 1.8 0:00.17 gunicorn
But when we keep using the web app, workers are chosen randomly, and when the third worker is chosen to serve a request (after the other 2 workers have already served requests and hold 270428 KB of memory each), the system tries to allocate memory that isn't available, and the worker crashes and restarts. So it looks like we can debug the memory consumption properly only if we have enough memory to use the app enough times without workers crashing and restarting. Only then can we identify whether memory use increases over time (a leak) or settles at the first case, where a worker consumes memory for itself and for Python's allocator (which is the normal situation).
Worker crashes without completing the request because too little memory is available:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25677 ubuntu 20 0 636968 331944 4 R 5.2 32.7 0:02.30 gunicorn
25678 ubuntu 20 0 644632 259684 0 S 0.0 25.6 0:02.95 gunicorn
25680 ubuntu 20 0 644636 259668 0 S 0.3 25.6 0:02.97 gunicorn
1094 mysql 20 0 689704 49144 0 S 0.0 4.8 16:22.92 mysqld
25673 ubuntu 20 0 64836 14428 0 S 0.6 1.4 0:00.23 gunicorn
Forgive me, I'm still not fully understanding. It looks like Gunicorn uses ~300MiB. The only difference with max-requests is that you're more likely to have all three workers consuming ~300MiB each, rather than only the one that's currently handling a request.
On the other hand, runserver is also using ~300MiB per process.
It seems like your instance is not large enough to run three workers and it doesn't matter whether you use runserver or gunicorn.
If you want to handle more concurrent requests, you'll need to use an async worker type rather than start multiple workers.
Is there anything else I'm missing?
I think you are absolutely right and nothing is missed. Actually, the problem was that the Gunicorn documentation suggests using 2*N + 1 workers, and because we had a single core we used 3 workers.
Nothing is mentioned, though, about how these workers are used. It looks like they are scheduled by the OS, so you never know which worker will handle an incoming request. Initially we had limited memory, of which nginx and other system processes took some part, and the rest (~800 MB) was supposed to be used by the workers. Because we configured 3 workers, and assuming each worker takes 300 MB, all 3 together would consume 900 MB, which isn't available. So when the third worker (you never know which, since it depends on OS scheduling) handles a request after the other 2 workers have already handled requests and are consuming 600 MB in total, the system doesn't have enough memory and the worker crashes.
This happened randomly, so we set max-requests to 1, which restarted the worker after every request and freed whatever memory had been allocated for our Python app. This overcame the worker crashes, but the system became slow; we thought it was the worker suddenly taking a huge amount of memory and slowing the system down, and we couldn't debug it because the worker was restarted every time. To debug further we needed max-requests large enough to serve more requests without restarting the worker, but that wasn't possible: if we allowed more requests with 3 workers (each of which can take ~300 MB) on a 1 GB system, workers crashed and we ran out of memory.
But once we set the number of workers to 1, the system behaves more or less like runserver. So now we are not running out of memory and the system is also fast.
So what we learned is that the general rule of creating 2*N+1 workers is just vague logic, as you need to consider memory as well.
Suppose each gunicorn worker takes ‘W’ memory, total system memory is ‘T’, and there is N=1 core.
So as per the suggestion, the minimum number of workers = 2*1 + 1 = 3.
Now suppose your application takes ‘A’ memory.
So the total memory required with only one worker handling all requests is R = W*3 + A.
As long as T is comfortably larger than R, everything is fine. The problem comes when the operating system schedules the other workers to serve requests too; then each worker consumes at least W+A memory. So the system memory required for gunicorn with 3 workers should actually be more than (W+A)*3 to avoid random hangs, missing responses, or bad-request responses (for example, if nginx is used as a reverse proxy, it will not get any response when a gunicorn worker crashes for lack of memory, and nginx will in turn respond with a Bad Request message).
So we need to be careful while configuring gunicorn and consider both the number of cores and the memory.
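The reasoning above amounts to taking the smaller of a CPU-based limit and a memory-based limit. A rough sketch using the approximate figures reported in this thread (1 core, ~800 MB left for workers, ~300 MB per loaded worker); the function names are made up for illustration:

```python
def cpu_based_workers(cores):
    """The 2*N + 1 starting point from the Gunicorn documentation."""
    return 2 * cores + 1


def memory_based_workers(available_mib, per_worker_mib):
    """How many workers of size W + A fit in the memory left for Gunicorn."""
    return max(1, int(available_mib // per_worker_mib))


def recommended_workers(cores, available_mib, per_worker_mib):
    """Take whichever limit is stricter."""
    return min(
        cpu_based_workers(cores),
        memory_based_workers(available_mib, per_worker_mib),
    )


# 1 core, ~800 MB available, ~300 MB per worker once the app is loaded.
print(recommended_workers(1, 800, 300))  # prints 2
```

Two workers would already use about 600 of the 800 MB, so leaving headroom and running a single worker, as was done here, is a reasonable choice on this instance.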
I believe that the Gunicorn documentation at http://docs.gunicorn.org/en/stable/settings.html should also mention memory consumption when configuring workers.
Logic is not that vague ;) Having 2*cores + 1 allows good load balancing between workers. It's also true that workers are isolated from each other, load the application independently, and therefore each consume some memory.
If you want to reduce the size of workers on a limited machine you can use the --preload setting, which will preload the application code and share it between workers.
Also, good point: we should have a part about memory in that design page.
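In a config file the --preload flag corresponds to the preload_app setting. A minimal sketch; the worker count is only an example value:

```python
# gunicorn.conf.py (hypothetical file name)

# Import the application in the master process before forking, so workers
# start from memory pages they share with the master.
preload_app = True

workers = 3
```

How much memory stays shared depends on copy-on-write behavior after the fork, so it is worth measuring per-worker usage before and after enabling it.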
gunicorn definitely eats more memory than it should; I recently switched to uwsgi and saved a lot of memory.
@tyan4g gunicorn itself doesn't use much memory; it doesn't buffer and has a pretty low memory footprint for a Python application. What is using the RAM is generally the application and its usage. All workers are isolated and by default the memory is not shared. uwsgi has a similar approach so it shouldn't impact the memory much. Note that using gthread or any other async worker allows you to reduce the number of workers if needed.
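As an illustration of trading workers for threads, a single gthread worker with a few threads loads the application once while still serving several requests concurrently. A minimal sketch with assumed values, which requires the application code to be thread-safe:

```python
# gunicorn.conf.py (hypothetical file name)

# Threaded worker: one process, several request-handling threads.
worker_class = "gthread"

# One copy of the application in memory instead of three.
workers = 1

# Number of threads per worker; 4 is an example value, not a recommendation.
threads = 4
```

Total concurrency is workers * threads, so this setup handles up to 4 requests at a time with roughly the memory of one worker.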
I also find that the worker processes' memory suddenly increases. Over time, we hit a memory leak. Then I configured gunicorn with max-requests and the problem was solved. I want to know the possible reasons why I hit a memory leak when I start gunicorn without the max_requests parameter. I am using gunicorn version 19.7.0.
@benoitc Can you give me some pointers?
@v-wiil you will probably need to investigate this on your own and open a new issue if you think there's a leak in Gunicorn.