Hi
I have a very weird problem - we are processing thousands of jobs per hour, and the Sidekiq jobs get slower over time. A job starts at around 10 seconds, and without a restart (for example, every hour) the same job can end up taking as long as 2 hours.
Has anyone run into a similar issue?
We've got our own custom servers, so there is also a possibility that the issue is related to a wrong server config.
One other person mentioned this recently, but I don't have any details. You'll need to profile the system to understand what part is slow. Are you running out of memory and swapping?
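A cheap way to see where the slow threads are stuck is the TTIN signal: Sidekiq responds to it by logging a backtrace for every worker thread. A minimal sketch, assuming you can read the process pid from a pidfile (the path here is hypothetical, adjust to your deployment):

    # Hypothetical pidfile path for the Sidekiq process.
    pid = Integer(File.read("/var/run/sidekiq.pid").strip)
    # Sidekiq logs a backtrace for each thread when it receives TTIN.
    Process.kill("TTIN", pid)

The backtraces land in the Sidekiq log, which should show whether the threads are blocked in the DB driver, Redis, or somewhere in your own code.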
Memory and processor usage is not changing; it looks the same.
Hello, I recently experienced the same issue. Jobs that usually take 1s to complete suddenly slowed down to ~2700-3000s. I put the details in a Stack Overflow question (http://stackoverflow.com/questions/36331456/sidekiq-workers-suddenly-slowed-down-almost-like-stuck-workers), and I'll post them again here.
Here's my setup:
Symptoms:
Here is the TTIN log. It seems like the process hung when:
But I'm not sure why it is happening. I searched around the net and found a similar discussion here: https://groups.google.com/forum/#!topic/sidekiq/_eFQGtAWm6E; however, I'm unable to understand the cause of the issue.
Any idea why this is happening? Thanks in advance for any help.
So we tracked down the issue and found that it was caused by our DB connection configuration.
We are using a load-balanced DB setup, like this: Sidekiq workers => Load Balancer => DB clusters. Changing the config to make the workers communicate directly with the DB (bypassing the load balancer) solved the issue.
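For illustration, a minimal database.yml sketch of that change, assuming a MySQL setup (the hostnames here are hypothetical):

    production:
      adapter: mysql2
      # Before: host pointed at the load balancer, e.g. db-lb.internal.
      # After: point the workers straight at the DB node.
      host: db-primary.internal
      ...

Note the trade-off: going direct gives up whatever failover the load balancer provided, so this is more a diagnostic step than a permanent fix.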
We'll look deeper into why it acts up when the DB connection goes through the load balancer, but this is not a Sidekiq issue.
We had the same problem with postgres connections where I work. The Postgres client library has configurable keepalive settings. Keeping a bit of traffic on the socket prevents the load balancer from dropping idle connections. Perhaps MySQL has a similar option.
We put something like this in database.yml:
production:
  ...
  variables:
    tcp_keepalives_idle: 60
    tcp_keepalives_interval: 1
    tcp_keepalives_count: 3
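If you want to confirm the settings actually took effect, Postgres exposes them as SHOW-able session parameters. A quick sketch from a Rails console, assuming ActiveRecord with the postgresql adapter:

    # Each SHOW returns the value in effect for the current session.
    %w[tcp_keepalives_idle tcp_keepalives_interval tcp_keepalives_count].each do |name|
      puts "#{name} = #{ActiveRecord::Base.connection.select_value("SHOW #{name}")}"
    end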