Framework: Queue workers not clearing memory after the jobs are done

Created on 8 Oct 2018 · 12 Comments · Source: laravel/framework

  • Laravel Version: 5.6
  • PHP Version: 7.2
  • Database Driver & Version:
  • Queue Driver: both beanstalkd and Redis

Description:

We do a lot of background processing with workers: at the moment we have 7 servers with 10 workers each, and we process more than a million jobs daily. One problem we have noticed over the past year is that memory is not released after a job finishes processing. If we call queue:restart or horizon:terminate (we adopted Horizon a month ago, but the same problem occurred before Horizon), htop shows we use 500M/8G, which is fine. After the workers run for a few hours, memory fills up and we get a lot of swap reads, which is a problem on AWS since they charge for IOPS.

I met @taylorotwell at Laracon EU and mentioned this to him. He said that he has a similar issue and for now the only way to solve it is to run a cron job every hour or so that executes queue:restart, which kills the processes and clears the RAM.
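For reference, the hourly restart can be a plain cron entry; a sketch is below (the project path /var/www/app is a placeholder):

```
# Gracefully restart all queue workers at the top of every hour.
# Supervisor (or Horizon) respawns them, which frees the leaked memory.
0 * * * * cd /var/www/app && php artisan queue:restart >/dev/null 2>&1
```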

Since we have a lot of job classes and we are not doing anything out of the ordinary, why is the RAM not released after a job is done, like in a regular PHP script? Is it a problem with the garbage collector and worker processes?

Steps To Reproduce:

The issue can be reproduced by running a lot of jobs on a single server without doing queue:restart for a while. At the moment we use Laravel 5.6 and PHP 7.2, but I think the issue was also present when we had Laravel 5.5 and PHP 7.1. I've heard other people complaining about this as well. The issue happens with both beanstalkd and Redis, and both with and without Horizon.

All 12 comments

I met @taylorotwell at Laracon EU and mentioned this to him. He said that he has a similar issue and for now the only way to solve it is to run a cron job every hour or so that executes queue:restart, which kills the processes and clears the RAM.

So what's the issue with doing that? You could schedule it every 15 minutes or every 3 hours, whatever suits your requirements.

It is well known that PHP was never designed for continuously running scripts; it was fundamentally designed to fire up and then die. This would seem a natural by-product of that.

The queue:work command has a --memory option that automatically shuts it down when it consumes too much memory. I assume you have supervisor or something similar that will restart any worker that is stopped.

I like to reference the article PHP is meant to die when talking about long-running processes.

We've been running Laravel's queue system for over a year now and never saw any memory growth.

Granted, we're not at ~1 million jobs per day, rather 100-200k.

Restarts only happen on deployment, which is frequent during the week, but maybe that's why we never saw anything.

I have experienced workers failing silently in a 5.1 project, configured with the database queue driver.

Speaking out of turn here, but could php-fpm manage the workers better? Not sure how to do that, but https://github.com/queue-interop/queue-interop looks interesting.

@bagf Are you trying to bring up an issue? We would need more information about that silent fail. Are you trying to suggest a new idea for php-fpm? Use https://github.com/laravel/ideas

@sisve sadly I don't have access to the exact code anymore, so I can't provide specifics, sorry. Restarting it (and clearing the memory usage) did fix the problem in all cases. I don't think volume caused the issue, as it wasn't processing even 100 jobs a month. I didn't know about the --memory flag until now; that might have fixed it.

I'm going to close this, as there is no specific bug.

Taylor has already recommended a "best practice" solution, and without specific reproducible code, there's not much more that can be done.

@laurencei I don't know if this is something that Laravel can actually fix (maybe it is fixable). It is probably a PHP issue ('PHP is meant to die'), as somebody mentioned before, because the workers load the code once and never restart unless we run queue:restart. I can't share reproducible code (mainly because you'd need to run a lot of jobs to reproduce it), but try sending 100k emails in 10 minutes, including HTML parsing; the memory fills up quickly. The 'best practice' solution is fine, but I'd need to run queue:restart every 5 minutes to avoid this issue, and that is something I don't like.

In that case just use queue:listen on any queues that have high memory leaking.

queue:listen "restarts" after each job, and is designed for this exact situation.
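To make the trade-off concrete, a sketch of the two invocations (the --sleep and --memory values are just examples):

```
# queue:work boots the framework once and keeps it in memory across jobs
# (fast, but any leaked memory accumulates until the worker is restarted)
php artisan queue:work --memory=128 --sleep=1

# queue:listen starts a fresh framework process for every job
# (slower per job, but memory is released when each job finishes)
php artisan queue:listen --sleep=1
```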

@laurencei, I have been facing the exact same issue for the past 3 months. I tried replacing queue:work with queue:listen, but it's still showing up. I just want to know if there has been any further improvement in this regard?

@AtifJaved1 as mentioned before, this is more of a PHP problem. I had this problem a long time ago, and there are a few things you can do to prevent your workers from using too much memory:

  • write cleaner code and assign null to your variables once you've finished using them, so the memory can be freed without waiting for the garbage collector to kick in
  • you may call gc_collect_cycles() at the end of your jobs to free some memory (but please read its associated documentation as well)
  • use a process supervisor for your workers, like supervisord. You can have multiple workers on the same queue with a memory limit per worker: after each job is processed, Laravel checks the memory usage and stops any worker using more than the specified limit; the supervisor then restarts that worker (which indirectly frees the memory as well).
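As a minimal sketch of the first two points, assuming a hypothetical SendReport job (plain PHP, no framework dependency):

```php
<?php
// Sketch: releasing memory at the end of a job's handle() method.
// SendReport and its workload are made up; the unset()/gc pattern is the point.
class SendReport
{
    public function handle(): int
    {
        $rows  = range(1, 100000);   // simulate a large working set
        $total = array_sum($rows);

        // Drop the reference explicitly: a long-lived worker never exits
        // between jobs, so memory held by lingering variables persists.
        unset($rows);

        // Immediately collect any remaining reference cycles.
        gc_collect_cycles();

        return $total;
    }
}

echo (new SendReport)->handle(), PHP_EOL; // prints 5000050000
```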

Supervisord configuration example:

[program:artisan_d1]
command=php ./artisan queue:work --daemon --tries=1 --sleep=1 --memory=128
startretries=5
autorestart=true

[program:artisan_d2]
command=php ./artisan queue:work --daemon --tries=1 --sleep=1 --memory=128
startretries=5
autorestart=true

@julian-costel Thanks for your help.
Isn't queue:listen the solution here? It's supposed to terminate the application after each job, no?
