Ruby version: 2.4.0
Sidekiq / Pro / Enterprise version(s): v4.2.8
We have been seeing seemingly random Redis::ConnectionError: Connection lost (ECONNRESET) errors since upgrading Sidekiq from 4.2.7 to 4.2.8.
We're hosted on Heroku using heroku-redis. Apart from Sidekiq, we haven't had any connection problems. I suspect it could have something to do with commit https://github.com/mperham/sidekiq/commit/2749c12b887815800b0a5c93bcf077c3dcd2796d which fixed https://github.com/mperham/sidekiq/issues/3303.
Backtrace: https://gist.github.com/trautwein/a390e1b4f7bff7d979bccad408fb3aca
Since downgrading Sidekiq to 4.2.7, the problem hasn't occurred again.
Would be great to know if anyone else has the same issue.
Hi trautwein,
We have the same issue with ruby 2.4.0 and Sidekiq 4.2.8. Thanks for letting us know downgrading Sidekiq helps. I hope this issue can be resolved.
Same here!
Ruby version: 2.3.1
Rails version: 4.2.7.1 (using Sidekiq through ActiveJob)
I'm gonna try downgrading. Thanks for the tip!
Just the same here
ruby 2.2.4p230 (2015-12-16 revision 53155) [x86_64-linux]
Rails 4.2.7.1
The fix is to add this pair to your client and server Redis config:
config.redis = { reconnect_attempts: 1 }
This effectively restores the pre-4.2.8 behavior.
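For anyone unsure where that line goes, here is a minimal sketch of an initializer that applies it to both the client and the server. The file path and the helper name sidekiq_redis_options are assumptions for illustration, not something from Sidekiq itself; Sidekiq.configure_server and Sidekiq.configure_client are the standard configuration hooks. The `defined?` guard only exists so the file can be loaded without Sidekiq present.

```ruby
# config/initializers/sidekiq.rb
# (hypothetical path; any file loaded by both the web and worker processes works)

# Hypothetical helper: returns a fresh options hash on every call so that
# the client and server configurations never share one mutable object.
def sidekiq_redis_options
  # Retry once when the Redis connection drops, as suggested in this thread.
  { reconnect_attempts: 1 }
end

# Apply the same setting on both sides, as the comment above recommends.
if defined?(Sidekiq)
  Sidekiq.configure_server do |config|
    config.redis = sidekiq_redis_options
  end

  Sidekiq.configure_client do |config|
    config.redis = sidekiq_redis_options
  end
end
```

Setting it in only one of the two blocks would leave the other side with the 4.2.8 default, so errors could keep showing up from whichever process was not configured.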
Anyone know why the network is so flaky? Sidekiq assumes a 24/7 connection to Redis. Could a firewall be closing quiet TCP connections?
@mperham
Thanks! Setting reconnect_attempts fixed it for us just as you expected.
As for the connection closing, I found the following in the Heroku blog (from July 28, 2016):
Set an Appropriate Connection Timeout
By default, Redis will never close idle connections, which means that if you don't close your Redis connections explicitly, you will lock yourself out of your instance.
To ensure this doesn't happen, Heroku Redis sets a default connection timeout of 300 seconds. This timeout doesn't apply to non-publish/subscribe clients, and other blocking operations.
Ensuring that your clients close connections properly, and that your timeout value is appropriate for your application will mean you never run out of connections.
Likely related. We were receiving the following exception (using Passenger): Redis::InheritedError: Tried to use a connection from a child process without reconnecting. You need to reconnect to Redis after forking or set :inherit_socket to true
Just to confirm, I also received Redis::InheritedError on 4.2.8 with Puma. Reverting to 4.2.7 stopped the exceptions from occurring. I did not try using reconnect_attempts: 1.
I don't know what the right answer is. This change fixed an edge case that can lead to duplicate jobs. Should we auto-handle networking issues and chance getting duplicate jobs?
@trautwein for the Heroku environment do we just set
config.redis = { reconnect_attempts: 1 }
in production.rb?
@Deekor The problem is fixed in 4.2.9. Don't configure anything, just upgrade.
@mperham By the way, thanks for reverting/fixing this!