Bull: redis script cache gets bloated after update to bull 2.0

Created on 3 Jan 2017  路  32Comments  路  Source: OptimalBits/bull

Hello,

I'm experiencing a growing Lua script cache after updating to bull-2.0
After 20 days of uptime my redis Lua cache grown up to 15GB

Anyone experiencing same issue?

Thanks

bug

Most helpful comment

Ok, I have created a custom bull-redlock package and fixed one last issue. The queue works pretty well now without leaks. I will make a new release in the following hours.

All 32 comments

can you check what entries are there in the cache? there must be a spurious entry that is for some reason not being reused but created every time it is used.

@manast I've added some console logs to the script caching function in lib/scripts.js. The hashes seem all the same.
Just to be sure that we are on the same, the value that is growing fast is used_memory_lua
Reverting to v1.1.3 is working fine.

Do you have a preferred way to analyse the redis lua script cache?

@manast can it be that _retryAtOnce is using directly eval and not caching the script and using evalsha so it is accumulating the same script again and again every time there is a retry?

eval is not caching the script, so I don't think that is the reason of the problem. I am not aware of any other command in redis besides LOAD that caches scripts. So if you have time you could put a console.log just before calling load, and you should be able to see which script is being cached continuously: https://redis.io/commands/script-load

You're right. I added a console.log in my dev environment.
It looks everything normal but just a nodejs process (concurrency = 5) is bloating my redis instance.
On production I need to revert to 1.1.3. The main change seems moving to ioredis and that looks good. I will try to spend a bit of time on it in next days, since it's crucial for me to maintain bull as a dependency.

Thanks for helping

The major new thing in 2.0 is the redlock algo, I would bet that is the root of the problem. @doublerebel any ideas?

I think you're right @manast @doublerebel
Is there an easy way to test the previous locking system?
Otherwise will set up a branch for it

when the old locking system was replaced by redlock many other changes where performed, so it is not easy to replace it. I will investigate when I have some time why the script cache is increasing, there could be an issue in the redlock dependency as well: https://github.com/mike-marcacci/node-redlock

I have analised the code, and I cannot see anything in redlock or in bull that would currently cache a lot of scripts. In redlock it is actually running eval directly (which is not so performant, I would like to improve this in the future), and in bull, the code related to lock is not using any scripts, it is relayed to redlock. If you put a console log here https://github.com/OptimalBits/bull/blob/master/lib/scripts.js#L46 outputing the hashes you should only see a few ones (correct behaviour) otherwise you should see a lot of them.
Now, is this cache bloating creating problems? it could be some redis optimization that caches lua stuff but that is reused by actual keys if necessary, and therefore nothing to worry about?

From redis documentation https://redis.io/commands/eval

"The EVAL command forces you to send the script body again and again. Redis does not need to recompile the script every time as it uses an internal caching mechanism, however paying the cost of the additional bandwidth may not be optimal in many contexts.hanism, however paying the cost of the additional bandwidth may not be optimal in many contexts."

@manast well my redis db that is around 1GB reached 15GB of memory, that's a pretty big overhead in my opinion and it is a risk on a production environment since there is no clear limit for it.

I think I found the issue:
https://github.com/OptimalBits/bull/blob/f1ea81c0e9a950f47d5aa39a2bc5ff82cb573355/lib/scripts.js#L328
Here you call redlock passing a variable locking Lua script.
This is probably what's bloating the redis cache

without checking redlock, I guess that line is there because it is not currently possible to specify extra keys to be used by redlock. That would certainly need a change in redlock as well, to be merged and released before we can fix it in bull...

Sounds good. At least it seems the explanation of what I'm seeing. I'll see if I can prepare something for graylog.
I think that any real use of bull-2.0 in production will end up in causing troubles. I'm handling around 1 million jobs per day and with the current cache grow I reach 15GB in 3 days.

Thanks for this awesome lib btw, I truly like it

Even if the 15Gb is scary, it may not be a problem if redis just reuse that memory when needed. I checked redis.conf file but unfortunately there is no setting to limit the max amount of lua cache. Maybe @antirez could give some explanation about this behaviour.

I agree that it would not be scary if there was a recycle of the memory. I checked for that parameter as well.
From here https://redis.io/commands/eval
The reason why scripts can be cached for long time is that it is unlikely for a well written application to have enough different scripts to cause memory problems. Every script is conceptually like the implementation of a new command, and even a large application will likely have just a few hundred of them. Even if the application is modified many times and scripts will change, the memory used is negligible.

I have the feeling that the changing the script each time is against redis design.

We are seeing this as well, our used_memory_lua was at 1118090240. Our backend locks up and once we run script flush things start working again.

I think we have fixed this on our pull request.

The issue is that line https://github.com/OptimalBits/bull/blob/f1ea81c0e9a950f47d5aa39a2bc5ff82cb573355/lib/scripts.js#L328 was creating a new script for every single job ID. 10,000 jobs, results in 10,000 scripts in the script cache and hence this issue.

We experience this issue as well with about 600k jobs a day. We use Redis 3.0.7 on Ubuntu 14.04

After 7 days of use Redis memory is arounf 8GB - data is less than 1GB, rest is reported as Lua scripts cache.

Any update on this issue?
Thanks

we have a PR that solves this issue but there is another issue with that PR that needs to be resolved before it can be merged.

That's how the memory on my Redis instance (that supports Bull) looks like. We are running about 1 million messages a day through the Bull queues. The server has 16 GB of memory, and looses about 3GB a day to Lua cache.
Running SCRIPT FLUSH command any other day helps, but is a hustle. Other than that, Bull works just great for us.
Hope this issue will be fixed soon.

image

@jeltok a fix for this issue is the next thing in my list.

@jeltok can you please verify that the issue is resolved with bull 2.2.0 so that we can close it?

Looks like I'm still seeing the issue. Upgraded from 1.1.3 to 2.2.0 today. Have kept an eye on the used_memory_lua field and it's been steadily increasing since the upgrade. Flushing the scripts from the redis instance drops the usage down to normal.

redis

hmm, ok, I will need to dig more into it.

Thanks! Just switched from kue, and other than this it's been working great! Let me know if you need anything :smile:

I have installed version 2.2.0 yesterday. Unfortunately it does not improve the issue with Lua cache by much.

image

@jeltok I would say nothing at all :). I will try to do a deeper analysis.

@manast @jeltok I have some time today will try to look into it further as well.

Joe

Ok, I have created a custom bull-redlock package and fixed one last issue. The queue works pretty well now without leaks. I will make a new release in the following hours.

Thanks a lot!!! Looks much better now :-)

image

Sorry for the noise, just wanted to send some gratitude to @manast for fixing this issue.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

davedbase picture davedbase  路  3Comments

NicolasDuran picture NicolasDuran  路  4Comments

pigaov10 picture pigaov10  路  3Comments

tdzienniak picture tdzienniak  路  4Comments

PhillippOhlandt picture PhillippOhlandt  路  4Comments