We have been seeing the RedisConnectionException (complete stack trace listed below) on a couple of production servers over the last few weeks. When the error happens, the only thing that works is an IISReset; after that, things go back to normal. I wanted to check whether this issue has been fixed in the latest version, 2.1.58.
Specifically, is the reconnect issue fixed, so that an IISReset is no longer required?
Your insights are appreciated.
PS: Issue #1120 discusses the RedisConnectionException for v2.0.601.
StackExchange.Redis.RedisConnectionException: No connection is available to service this operation: SETEX DSAProductDC:g4_cfodcdaissecret_5.59.448.0; SocketClosed (ReadEndOfStream, last-recv: 0) on redis-10009.rdsud-np.us.dell.com:10009/Subscription, Idle/MarkProcessed, last: PING, origin: ReadFromPipe, outstanding: 0, last-read: 0s ago, last-write: 1s ago, keep-alive: 10s, state: ConnectedEstablished, mgr: 9 of 10 available, in: 0, in-pipe: 0, out-pipe: 0, last-heartbeat: 0s ago, last-mbeat: 0s ago, global: 0s ago, v: 2.0.601.3402; IOCP: (Busy=0,Free=600,Min=480,Max=600), WORKER: (Busy=21,Free=579,Min=480,Max=600), Local-CPU: n/a ---> StackExchange.Redis.RedisConnectionException: SocketClosed (ReadEndOfStream, last-recv: 0) on redis-10009.rdsud-np.us.dell.com:10009/Subscription, Idle/MarkProcessed, last: PING, origin: ReadFromPipe, outstanding: 0, last-read: 0s ago, last-write: 1s ago, keep-alive: 10s, state: ConnectedEstablished, mgr: 9 of 10 available, in: 0, in-pipe: 0, out-pipe: 0, last-heartbeat: 0s ago, last-mbeat: 0s ago, global: 0s ago, v: 2.0.601.3402 --- End of inner exception stack trace --- at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImplT at
Other related thread:
https://github.com/StackExchange/StackExchange.Redis/issues/1120
+100
does anyone know a workaround?
@kulku I am curious if you are running into this issue while accessing a clustered cache?
Yes, that's correct, @deepakverma
@kulku then you might be running into this issue https://github.com/StackExchange/StackExchange.Redis/issues/1501
Thank you, @deepakverma
Looks like it. Hoping for a decision from @mgravell to take this forward.
@mgravell, I hope you got a chance to review the suggestions on this thread. Have you made a decision yet?
We are seeing this issue in production again, this time on multiple boxes (typically it's just a single box).
I did a bit more investigation. My observation is that the "StackExchange.Redis.RedisConnectionException: No connection is available to service this operation" exception did not occur with NuGet versions 1.2.3 and 1.1.603.
We performed a major upgrade to NuGet version 2.0.601 in June. That's when we started seeing this behavior in production.
I thought I would post it here in case it helps move the resolution forward.
I am sure NuGet 2.x brought in a bunch of much sought-after enhancements.
I hope this helps a bit.
We are having the same issue on our production cluster running on Kubernetes and .NET Core 3.1 with Azure Redis Service. It can run for weeks, but then this issue suddenly happens. The only way to solve it is to kill the pod. A quick workaround for now is that we catch the exception and force the app to restart to "heal" itself. @mgravell, do you have any advice on how this can be fixed in a better way?
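For anyone wanting to try the same catch-and-restart workaround, here is a minimal sketch of what it could look like. The `RedisFailureWatchdog` class, its threshold, and the choice of `Environment.Exit(1)` are assumptions for illustration, not what the poster actually runs; it only treats the symptom by letting the orchestrator (Kubernetes, IIS, ...) start a fresh process.

```csharp
using System;
using System.Threading;
using StackExchange.Redis;

// Hypothetical helper: counts consecutive Redis connection failures and
// exits the process once a threshold is reached, so that the orchestrator
// starts a fresh instance.
public sealed class RedisFailureWatchdog
{
    private readonly int _maxConsecutiveFailures; // assumed threshold
    private int _consecutiveFailures;

    public RedisFailureWatchdog(int maxConsecutiveFailures = 5)
    {
        _maxConsecutiveFailures = maxConsecutiveFailures;
    }

    public T Execute<T>(Func<T> redisCall)
    {
        try
        {
            T result = redisCall();
            // A successful call resets the failure counter.
            Interlocked.Exchange(ref _consecutiveFailures, 0);
            return result;
        }
        catch (RedisConnectionException)
        {
            if (Interlocked.Increment(ref _consecutiveFailures) >= _maxConsecutiveFailures)
            {
                // Non-zero exit code so the restart policy treats this as a failure.
                Environment.Exit(1);
            }
            throw;
        }
    }
}
```

Usage would be something like `watchdog.Execute(() => db.StringGet("some-key"))`, where `db` is an `IDatabase` obtained from the multiplexer.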
We are also having the same issue with the same setup as andtii (.NET Core 3.1 with Azure Redis Service running in Kubernetes).
We were seeing the same issue using Azure Redis Service, but only from specific machines. Consistently. Other machines would work correctly. We couldn't identify any differences between those machines w.r.t. configuration or setup.
On Azure App Service we sometimes hit the SNAT port limits, not with Redis but with, e.g., outgoing HTTP or SQL connections. I think (I didn't have a chance to fully investigate) that the load balancer in front of the App Service refuses communication when these SNAT limits are reached.
Did you find a solution to this issue?
We have exactly the same problem on k8s
@mgravell, our team has been facing this problem for over 4 months. Although a solution was provided in July, we are still waiting for the new version to be released so that we can ask third parties like Hangfire to update their dependency and get Redis Cluster running correctly in production. It has been a very long wait for us, trying to survive production on-call hell...
Could you please consider publishing the new version and giving us the chance to decide for ourselves?
Greetings,
We're seeing this exact same error, including Redis timeouts of more than 100 seconds on an Azure Redis C1 instance with barely 50-60 connections on average.
The ConnectionMultiplexer seems to just go away, even with abortConnect = false and allowAdmin = true.
Also, memory usage keeps climbing above 1-2 GB on an App Service that normally averages around 250-300 MB.
There's a really major issue with the connectivity part here.
It looks like it happens if there is a connection drop/blip in Azure or in connectivity towards Redis.
For now we have tried implementing the recommended Force Reconnect approach.
Our async and sync timeouts are set to 30 seconds just in case, and we're not storing any objects larger than 1K in Redis.
If anyone has any idea what's going on here, please share; this is causing major issues in our production at the moment.
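For readers who have not seen the Force Reconnect approach mentioned above, a simplified sketch follows. It is not the poster's code and is cut down from the commonly recommended pattern: the `RedisConnection` class name, the placeholder connection string, and the 60-second minimum interval are all assumptions. The idea is to keep the multiplexer in a `Lazy<ConnectionMultiplexer>` and swap in a new one when connection errors persist.

```csharp
using System;
using StackExchange.Redis;

public static class RedisConnection
{
    // Assumed values; tune for your environment.
    private static readonly TimeSpan ReconnectMinInterval = TimeSpan.FromSeconds(60);
    private const string ConnectionString = "localhost:6379,abortConnect=false"; // placeholder

    private static readonly object ReconnectLock = new object();
    private static DateTimeOffset _lastReconnect = DateTimeOffset.MinValue;
    private static Lazy<ConnectionMultiplexer> _multiplexer = CreateMultiplexer();

    public static ConnectionMultiplexer Connection => _multiplexer.Value;

    private static Lazy<ConnectionMultiplexer> CreateMultiplexer() =>
        new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect(ConnectionString));

    // Intended to be called from a catch (RedisConnectionException) block.
    public static void ForceReconnect()
    {
        lock (ReconnectLock)
        {
            DateTimeOffset now = DateTimeOffset.UtcNow;
            if (now - _lastReconnect < ReconnectMinInterval)
            {
                return; // reconnected recently; avoid a reconnect storm
            }

            Lazy<ConnectionMultiplexer> old = _multiplexer;
            _multiplexer = CreateMultiplexer();
            _lastReconnect = now;

            if (old.IsValueCreated)
            {
                // Best-effort cleanup of the old, presumably broken, multiplexer.
                try { old.Value.Close(allowCommandsToComplete: false); }
                catch (Exception) { /* ignore errors while closing */ }
            }
        }
    }
}
```

Callers would always go through `RedisConnection.Connection.GetDatabase()` rather than caching the multiplexer themselves, so a swap takes effect on the next call. A fuller version would typically also require errors to persist for some time before reconnecting.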
We have encountered this error too, for several months now. We do not know what happened or what was changed before the error first occurred. We used Azure Redis Cache before and switched to a Linux App Service with Redis. That was no solution; the exception still occurs.
It seems to happen when there are short transient connectivity issues, and it seems Azure Redis or the Azure network has these pretty often. Maybe Stack Overflow has its own Redis servers/network and that's why they don't get the issue, which would otherwise make them prioritize this. This weekend we got alerts that it happened about 500 times over a couple of hours, causing our apps to restart every time to recover.
I can confirm that in the last few days the exceptions have been happening regularly on Azure-hosted Redis and on Azure App Service with Linux and Redis. And yes, the apps restart when it happens. I optimized our code so I think the app will no longer restart, but Redis will still be unavailable from time to time. This is annoying because we are also using Microsoft.Web.Redis.RedisSessionStateProvider, which is based on StackExchange.Redis.
The exception we receive is:
No connection is active/available to service this operation: GET ServiceProvider:GetDefaultServiceProviderAsync_e5b8f26cbd5591917e218353e665563d069d531cf31ec9fb00a7ed1baeeffc60; A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond, mc: 1/1/0, mgr: 10 of 10 available, clientName: CM_WebApp, IOCP: (Busy=0,Free=1000,Min=2,Max=1000), WORKER: (Busy=12,Free=32755,Min=2,Max=32767), v: 2.1.58.34321
I can access Redis without any problems at any time, and there is no slow log entry or anything else that would point to a problem with the Redis server itself.
I was having this issue with v2.1.58 and can confirm that it is not related to any configuration setting (I tried every way of modifying SSL/connectTimeout/abortConnect ...). After I upgraded my EC2 instance from t3.medium to m5.large, the issue was gone (I have no idea why).
Currently I suspect the memory available to the local web app. If there is not enough free (defragmented) memory available while receiving data from Redis storage, the error will happen.
This is only my guess, because we have seen this issue and other memory-related issues shortly before the app service restarted due to high memory consumption. Unfortunately, the Redis exceptions suggest other problems.
@icetea7 you upgraded memory from 4 to 8 GB, so that would fit.
I've just stumbled on the same issue while testing Sentinel connection mode in a 3-node replication setup.
I can reliably reproduce it by triggering a master failover. The multiplexer seems not to realize, or at least not fast enough, that there is a new master, and blows up without retrying.
Exception:
StackExchange.Redis.RedisConnectionException : No connection is active/available to service this operation: SET test-connect; SocketClosed (ReadEndOfStream, last-recv: 0) on 10.51.128.142:6379/Subscription, Idle/MarkProcessed, last: SUBSCRIBE, origin: ReadFromPipe, outstanding: 1, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: ConnectedEstablished, mgr: 1 of 10 available, in: 0, in-pipe: 0, out-pipe: 0, last-heartbeat: 0s ago, last-mbeat: 0s ago, global: 0s ago, v: 2.2.4.27433, mc: 1/1/0, mgr: 5 of 10 available, clientName: EUNBCKW6L13, IOCP: (Busy=2,Free=998,Min=16,Max=1000), WORKER: (Busy=16,Free=32751,Min=16,Max=32767), v: 2.2.4.27433
Data:
Redis-Multiplexer-Connects: 1/1/0
Redis-Manager: 5 of 10 available
Redis-Client-Name: EUNBCKW6L13
Redis-ThreadPool-IO-Completion: (Busy=2,Free=998,Min=16,Max=1000)
Redis-ThreadPool-Workers: (Busy=16,Free=32751,Min=16,Max=32767)
Redis-Busy-Workers: 16
Redis-Version: 2.2.4.27433
redis-command: SET test-connect
request-sent-status: WaitingToBeSent
----> StackExchange.Redis.RedisConnectionException : SocketClosed (ReadEndOfStream, last-recv: 0) on 10.51.128.142:6379/Subscription, Idle/MarkProcessed, last: SUBSCRIBE, origin: ReadFromPipe, outstanding: 1, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: ConnectedEstablished, mgr: 1 of 10 available, in: 0, in-pipe: 0, out-pipe: 0, last-heartbeat: 0s ago, last-mbeat: 0s ago, global: 0s ago, v: 2.2.4.27433
Data:
Redis-FailureType: SocketClosed
Redis-EndPoint: 10.51.128.142:6379
Redis-Origin: ReadFromPipe
Redis-Outstanding-Responses: 1
Redis-Last-Read: 0s ago
Redis-Last-Write: 0s ago
Redis-Keep-Alive: 60s
Redis-Previous-Physical-State: ConnectedEstablished
Redis-Manager: 1 of 10 available
Redis-Inbound-Bytes: 0
Redis-Inbound-Pipe-Bytes: 0
Redis-Outbound-Pipe-Bytes: 0
Redis-Last-Heartbeat: 0s ago
Redis-Last-Multiplexer-Heartbeat: 0s ago
Redis-Last-Global-Heartbeat: 0s ago
Redis-Version: 2.2.4.27433
at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in /_/src/StackExchange.Redis/ConnectionMultiplexer.cs:line 2791
at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in /_/src/StackExchange.Redis/RedisBase.cs:line 54
at Caching.Redis.Tests.Integration.RedisCacheTests.Connect_WithSentinel_Works() in C:\SCM\Caching\Caching\Redis.Tests.Integration\RedisCacheTests.cs:line 163
Hope this helps narrow down the root cause.
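Since the multiplexer does not retry on its own in the failover scenario described above, one application-side stopgap is to retry the call with a short backoff. The sketch below is only an illustration: `RedisRetry`, the attempt count, and the delays are assumed names and values, and whether retrying helps depends on how quickly the multiplexer discovers the new master.

```csharp
using System;
using System.Threading;
using StackExchange.Redis;

public static class RedisRetry
{
    // Hypothetical helper: retries a Redis call while the connection is unavailable,
    // doubling the wait between attempts. The last failure is rethrown to the caller.
    public static T Execute<T>(Func<T> redisCall, int maxAttempts = 3, int initialDelayMs = 200)
    {
        int delayMs = initialDelayMs;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return redisCall();
            }
            catch (RedisConnectionException) when (attempt < maxAttempts)
            {
                Thread.Sleep(delayMs);
                delayMs *= 2; // exponential backoff
            }
        }
    }
}
```

For the failing test above, this would be used as `RedisRetry.Execute(() => db.StringSet("test-connect", "value"))`, with `db` being the `IDatabase` under test.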
We're also facing this issue. We recently upgraded our .NET Framework app to .NET Core 3.1. We're using the Amazon Redis cache.
Unhandled exception. StackExchange.Redis.RedisConnectionException: No connection is active/available to service this operation: EXISTS NotificationUsers; UnableToConnect on stage.example.com:6379/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.1.58.34321, mc: 1/1/0, mgr: 10 of 10 available, clientName: ip-172-31-32-231, IOCP: (Busy=0,Free=1000,Min=2,Max=1000), WORKER: (Busy=1,Free=32766,Min=2,Max=32767), v: 2.1.58.34321
---> StackExchange.Redis.RedisConnectionException: UnableToConnect on stage.example.com:6379/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.1.58.34321
at StackExchange.Redis.TaskExtensions.TimeoutAfter(Task task, Int32 timeoutMs) in /_/src/StackExchange.Redis/TaskExtensions.cs:line 49
at StackExchange.Redis.ConnectionMultiplexer.WaitAllIgnoreErrorsAsync(Task[] tasks, Int32 timeoutMilliseconds, LogProxy log, String caller, Int32 callerLineNumber) in /_/src/StackExchange.Redis/ConnectionMultiplexer.cs:line 740
--- End of inner exception stack trace ---
at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in /_/src/StackExchange.Redis/ConnectionMultiplexer.cs:line 2810
at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in /_/src/StackExchange.Redis/RedisBase.cs:line 54
at StackExchange.Redis.RedisDatabase.KeyExists(RedisKey key, CommandFlags flags) in /_/src/StackExchange.Redis/RedisDatabase.cs:line 668
at SP.NotificationCenter.Business.Redis.RedisCacheService.UpdateUsers(UserDto user) in D:\Visual Studio\TFS\SP.NotificationCenter\Dev\2.1.0\SP.NotificationCenter.Business\Redis\RedisCacheService.cs:line 503
at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__139_1(Object state)
at System.Threading.QueueUserWorkItemCallback.<>c.<.cctor>b__6_0(QueueUserWorkItemCallback quwi)
at System.Threading.ExecutionContext.RunForThreadPoolUnsafe[TState](ExecutionContext executionContext, Action`1 callback, TState& state)
at System.Threading.QueueUserWorkItemCallback.Execute()
at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
Aborted
I also saw a similar anomaly about client memory usage climbing up, but in addition to that, the active connections metrics in AWS ElastiCache also seemed to be piling up. Similarly to what others are reporting in this thread, it's also a randomly happening issue here.
Can you please confirm whether you are also seeing / saw back when you posted this the number of active connections adding up on Redis' side? I'm asking to make sure that the problems I experience are more or less the same as in this thread.
Thanks.
My server has been using Redis Cache since Feb 2020; the same issue happened for the first time today.
Memory usage was low (2 MB) with 0% server load, but we kept getting the error "No connection is available to service this operation".
The issue was resolved by rebooting the Redis Cache server.
I found that if I added the below text to the Connection String, the issues stopped. I'll monitor to see if they stay away:
ssl=True,abortConnect=False,sslprotocols=tls12
Changing the connection string might just have reset the connection anyway, though, so the true solution is still unknown.
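If you build the configuration in code rather than in a connection string, the same three settings could be expressed roughly as below. This is a sketch only: the endpoint is a placeholder and the `RedisConfig` class name is made up for illustration; `Ssl`, `AbortOnConnectFail`, and `SslProtocols` are the `ConfigurationOptions` properties that correspond to `ssl`, `abortConnect`, and `sslprotocols`.

```csharp
using System.Security.Authentication;
using StackExchange.Redis;

public static class RedisConfig
{
    public static ConnectionMultiplexer Connect()
    {
        // "your-cache.example.com:6380" is a placeholder endpoint.
        ConfigurationOptions options = ConfigurationOptions.Parse("your-cache.example.com:6380");
        options.Ssl = true;                        // ssl=True
        options.AbortOnConnectFail = false;        // abortConnect=False
        options.SslProtocols = SslProtocols.Tls12; // sslprotocols=tls12

        return ConnectionMultiplexer.Connect(options);
    }
}
```

With `AbortOnConnectFail = false`, `Connect` returns a multiplexer even if the server is unreachable at startup and keeps trying to connect in the background.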
+100
does anyone know a workaround?
Add this to your connection string: ssl=True,abortConnect=False,sslprotocols=tls12
@ianvink how are you doing?
I'm facing the same problem, and I already use your suggested connection string:
ssl=True,abortConnect=False,sslprotocols=tls12
But I am still having the issue. My app runs in a container and accesses Redis in another container inside a k8s cluster.
It looks like my app doesn't connect when I use your connection string!
I had to reboot all the Redis containers and the apps as well; after doing this, the app is using the cache properly.