While receiving events using Event Processor Host, from time to time, I'm getting partition receiver exceptions:
System.InvalidOperationException: Can't create session when the connection is closing.
at Microsoft.Azure.Amqp.AmqpConnection.AddSession(AmqpSession session, Nullable`1 channel)
at Microsoft.Azure.Amqp.AmqpCbsLink.OpenCbsRequestResponseLinkAsyncResult.GetAsyncSteps()+MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.Amqp.AsyncResult.End[TAsyncResult](IAsyncResult result)
at Microsoft.Azure.Amqp.AmqpCbsLink.EndCreateCbsLink(IAsyncResult result)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.Amqp.FaultTolerantAmqpObject`1.OnCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.TaskHelpers.EndAsyncResult(IAsyncResult asyncResult)
at Microsoft.Azure.Amqp.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.Amqp.AsyncResult.End[TAsyncResult](IAsyncResult result)
at Microsoft.Azure.Amqp.AmqpCbsLink.<>c__DisplayClass4_0.1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.CreateLinkAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.FaultTolerantAmqpObject`1.OnCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan waitTime)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan waitTime)
at Microsoft.Azure.EventHubs.PartitionReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan waitTime)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.ReceivePumpAsync(CancellationToken cancellationToken, Boolean invokeWhenNoEvents)
There is an open issue related to this in the azure-amqp SDK GitHub repo (https://github.com/Azure/azure-amqp/issues/140), but one of the team members suggests that:
This exception is expected when a session is to be created but the connection is closing. Typically the session creation is a result of an API call from the upper SDK and should be handled by the SDK as a communication error. Please report the error to the SDKs you are using so it can be handled correctly by the retry policy in the SDKs.
So sorry that I missed this issue. Is it still happening? If so, how often are you seeing the failures? Can you also check that your code is not unregistering the host anywhere? Receivers are only closed during the unregister call.
I had the same error last night happening in multiple microservices running in k8s.
Two errors to be precise:
fail: ConfirmService[0]
Message handler encountered an exception.Exception context for troubleshooting:
- Endpoint: some-app-test.servicebus.windows.net
- Entity Path: test-confirm
- Executing Action: Receive
System.InvalidOperationException: Can't create session when the connection is closing.
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass64_0.<<ReceiveAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.MessageReceivePump.<MessagePumpTaskAsync>b__11_0()
And:
fail: CacheInvalidateHostedService[0]
Message handler encountered an exception.Exception context for troubleshooting:
- Endpoint: some-app-test.servicebus.windows.net
- Entity Path: test-cache-invalidate/Subscriptions/CacheInvalidateHostedService
- Executing Action: Receive
System.ObjectDisposedException: Cannot access a disposed object.
Object name: '$cbs'.
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass64_0.<<ReceiveAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.MessageReceivePump.<MessagePumpTaskAsync>b__11_0()
They happened from 5:00 AM to 7:30 AM.
I have only been working with Service Bus for a few days, so I can't tell whether this started all of a sudden or will recur regularly, for example every week.
This is my class: https://gist.github.com/stefankip/8ba745894018c3d0313ceae3633f8eef
So sorry that I missed this issue. Is it still happening? If so, how often are you seeing the failures? Can you also check that your code is not unregistering the host anywhere? Receivers are only closed during the unregister call.
No, I'm not unregistering a host anywhere in the code.
It happens very rarely; I haven't seen this error since I reported the issue.
We experience the same issue from time to time.
Microsoft.Azure.EventHubs.ServiceFabricProcessor, Version=0.5.4.0, Microsoft.Azure.EventHubs.ServiceFabricProcessor.ServiceFabricProcessor+<InnerRunAsync>d__32.MoveNext - Can't create session when the connection is closing.
Microsoft.Azure.Amqp, Version=2.4.0.0, Microsoft.Azure.Amqp.AmqpConnection.AddSession - Can't create session when the connection is closing.
[{"parsedStack":[{"assembly":"Microsoft.Azure.EventHubs.ServiceFabricProcessor, Version=0.5.4.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c","method":"Microsoft.Azure.EventHubs.ServiceFabricProcessor.ServiceFabricProcessor+<InnerRunAsync>d__32.MoveNext","level":0,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":1,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess","level":2,"line":0},{"assembly":"System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":3,"line":0},{"assembly":"Microsoft.Azure.EventHubs.ServiceFabricProcessor, Version=0.5.4.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c","method":"Microsoft.Azure.EventHubs.ServiceFabricProcessor.ServiceFabricProcessor+<RunAsync>d__31.MoveNext","level":4,"line":0}],"outerId":"0","message":"Can't create session when the connection is closing.","type":"System.InvalidOperationException","id":"49907794"}]
Any updates on this? We have also been getting the same issue over the last days and weeks. As far as I can see in Application Insights, it happens overnight.
Which SDK and version are you using?
I got the same errors in k8s. Microsoft.Azure.ServiceBus 4.1.1 is used in my project. We have this in production. Please help troubleshoot.
System.InvalidOperationException: Can't create session when the connection is closing.
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass64_0.<<ReceiveAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.MessageReceivePump.<MessagePumpTaskAsync>b__11_0().
System.ObjectDisposedException: Cannot access a disposed object.
Object name: '$cbs'.
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass64_0.<<ReceiveAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(TimeSpan operationTimeout)
at Microsoft.Azure.ServiceBus.MessageReceivePump.<MessagePumpTaskAsync>b__11_0().
It seems like they happened every 18 minutes.
There are 2 cases where you can observe this failure.
I will talk to the AMQP layer devs about distinguishing between those two cases so we can tell which one is actually causing the error.
In the meantime, can you make sure option 1 isn't your case? In other words, is the client being closed while there are runtime operations pending?
Thank you @serkantkaraca for the feedback. I am sure option 1 is not my case.
We observed the same exception on multiple IoT Hub connections between 2020-02-20T01:36:54 UTC and 2020-02-20T01:56:55 UTC. All IoT Hub instances run in the EU West region.
Multiple but not all partitions of the same IoT Hub are affected.
There was no shutdown request for the services at that time. Even so, I would expect that if we handle the cancellation token in the EventProcessor class, the SDK should gracefully handle option 1 mentioned by @serkantkaraca. We only receive events this way and do not use any other receive or send mechanism in parallel.
Following package versions are used:
Microsoft.Azure.EventHubs.ServiceFabricProcessor 0.5.4
which uses
Microsoft.Azure.EventHubs 4.1.0
where we think the error should be handled.
Exception Stacktrace:
System.InvalidOperationException: Can't create session when the connection is closing.
at Microsoft.Azure.Amqp.AmqpConnection.AddSession (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at Microsoft.Azure.Amqp.AmqpCbsLink+OpenCbsRequestResponseLinkAsyncResult+<GetAsyncSteps>d__7.MoveNext (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.Amqp.AsyncResult.End (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at Microsoft.Azure.Amqp.AmqpCbsLink.EndCreateCbsLink (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.Amqp.FaultTolerantAmqpObject`1+<OnCreateAsync>d__6.MoveNext (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.Amqp.Singleton`1+<GetOrCreateAsync>d__13.MoveNext (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.Amqp.Singleton`1+<GetOrCreateAsync>d__13.MoveNext (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.Amqp.TaskHelpers.EndAsyncResult (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at Microsoft.Azure.Amqp.IteratorAsyncResult`1.StepCallback (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.Amqp.AsyncResult.End (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at Microsoft.Azure.Amqp.AmqpCbsLink+<>c__DisplayClass4_0.<SendTokenAsync>b__1 (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver+<CreateLinkAsync>d__15.MoveNext (Microsoft.Azure.EventHubs, Version=4.1.0.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.Amqp.FaultTolerantAmqpObject`1+<OnCreateAsync>d__6.MoveNext (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.Amqp.Singleton`1+<GetOrCreateAsync>d__13.MoveNext (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.Amqp.Singleton`1+<GetOrCreateAsync>d__13.MoveNext (Microsoft.Azure.Amqp, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver+<OnReceiveAsync>d__13.MoveNext (Microsoft.Azure.EventHubs, Version=4.1.0.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver+<OnReceiveAsync>d__13.MoveNext (Microsoft.Azure.EventHubs, Version=4.1.0.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.EventHubs.PartitionReceiver+<ReceiveAsync>d__30.MoveNext (Microsoft.Azure.EventHubs, Version=4.1.0.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver+<ReceivePumpAsync>d__18.MoveNext (Microsoft.Azure.EventHubs, Version=4.1.0.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c)
I hope this information helps you solve the issue. We only observed the issue once but it blocked all our production instances as we got no more events from our sensors.
As a first step, I will improve the exception contract to distinguish case 1 from case 2.
Any update on this?
The fix didn't make it into 4.2.0.
4.2.1 will include it and will be released in 3-4 weeks.
Same problem here in NorthEurope.
Can't create session when the connection is closing at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<OnReceiveAsync>d__86.MoveNext()
on 13th April between 22h15 and 22h50 CET
on 23rd April between 05h20 and 07h25 CET
Microsoft.Azure.ServiceBus 3.3.0
Microsoft.Azure.Management.ServiceBus 2.0.1
Still seeing this from time to time. Any ideas?
I'm seeing a callstack similar to the original description of this issue.
What seems to be happening in our case is that we get a timeout exception, which is not unusual since we have some aggressive timeouts, but then the next call on the partition receiver fails with this error. My hunch is that the timeout exception triggers some cleanup that doesn't complete before the exception surfaces to the caller. When the caller then retries, the InvalidOperationException is thrown.
MSFT FTE here. Ping me and I can provide more details. I have a fairly consistent repro in our service.
Here is the abridged version of what I see from my logging breakpoints.
We call ReceiveAsync for Partition ID 20 with PartitionReceiver120, but we don't see this call return (there would be a log line after the one below saying the call completed).
Receiving messages... ClientId: "PartitionReceiver120(***,$Default,20)" hash: 403058 ThreadId: 26004
Exception is thrown for Partition ID 20 on thread ID 17120
System.TimeoutException: The operation did not complete within the allocated time 00:00:02.7197845 for object receiver691.
at Microsoft.Azure.Amqp.AsyncResult.End[TAsyncResult](IAsyncResult result)
at Microsoft.Azure.Amqp.AmqpObject.OpenAsyncResult.End(IAsyncResult result)
at Microsoft.Azure.Amqp.AmqpObject.EndOpen(IAsyncResult result)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.CreateLinkAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.FaultTolerantAmqpObject`1.OnCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan waitTime)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan waitTime)
at Microsoft.Azure.EventHubs.PartitionReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan waitTime)
The next time we call into PartitionReceiver120 ReceiveAsync, we get the InvalidOperationException for PartitionReceiver120.
Receiving messages... ClientId: "PartitionReceiver120(***,$Default,20)" hash: 403058 ThreadId: 21748
System.InvalidOperationException: Can't create session when the connection is closing.
at Microsoft.Azure.Amqp.AmqpConnection.AddSession(AmqpSession session, Nullable`1 channel)
at Microsoft.Azure.Amqp.AmqpConnection.CreateSession(AmqpSessionSettings sessionSettings)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.CreateLinkAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.FaultTolerantAmqpObject`1.OnCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan waitTime)
at Microsoft.Azure.EventHubs.Amqp.AmqpPartitionReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan waitTime)
at Microsoft.Azure.EventHubs.PartitionReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan waitTime)
We are using Microsoft.Azure.EventHubs v3.0.0.
This seems to happen consistently for any PartitionReceiver that gets a timeout exception.
Hope this helps. Again, feel free to ping me for more details.
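For anyone hitting the same timeout-then-InvalidOperationException sequence, here is a minimal sketch of a receive loop that tolerates it. It is not taken from the SDK: `ReceiveLoopSketch`, `PumpAsync`, `receiveBatch`, `maxConsecutiveFailures`, and `coolDown` are hypothetical names, and the delegate stands in for whatever wraps your `ReceiveAsync` call. It assumes, per the hunch above, that a short pause gives the connection cleanup time to finish before the next attempt.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ReceiveLoopSketch
{
    // Assumption: message text copied from the stack traces in this thread.
    private const string ClosingMessage =
        "Can't create session when the connection is closing";

    private static bool IsRetryable(Exception ex) =>
        ex is TimeoutException ||
        (ex is InvalidOperationException &&
         ex.Message.IndexOf(ClosingMessage, StringComparison.Ordinal) >= 0);

    // receiveBatch is a hypothetical delegate that receives one batch and
    // returns the number of events it handled.
    public static async Task<int> PumpAsync(
        Func<Task<int>> receiveBatch,
        int maxConsecutiveFailures,
        TimeSpan coolDown,
        CancellationToken cancellationToken)
    {
        int total = 0;
        int consecutiveFailures = 0;

        while (!cancellationToken.IsCancellationRequested)
        {
            try
            {
                total += await receiveBatch().ConfigureAwait(false);
                consecutiveFailures = 0; // a successful call resets the counter
            }
            catch (Exception ex) when (IsRetryable(ex))
            {
                // Give up once the failures stop looking transient.
                if (++consecutiveFailures >= maxConsecutiveFailures)
                {
                    throw;
                }

                // Pause before the next receive; this throws
                // TaskCanceledException if cancellation is requested meanwhile.
                await Task.Delay(coolDown, cancellationToken).ConfigureAwait(false);
            }
        }

        return total;
    }
}
```

Any other exception type, including an InvalidOperationException with a different message, propagates immediately, so unrelated bugs are not hidden by the loop.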
Edit: just noticed the note about having an improved exception contract in 4.2.1. I'll see if we can upgrade and treat this as a retryable exception.
Edit2: Looks like upgrading to 4.2.0 has greatly reduced/eliminated the number of hits we are seeing when debugging locally.
@serkantkaraca mind linking the PR for the exception contract change so we can plan on treating this as a retryable exception when 4.2.1 is released? Thanks!
PR under review. Please reactivate if you still hit the issue with 4.3.0 release. https://github.com/Azure/azure-sdk-for-net/pull/14030
@serkantkaraca
I am still facing this issue when our API is under high load.
Initially I used Microsoft.Azure.EventHubs 3.0.0, and I have since updated to the latest version, 4.3.0.
System.InvalidOperationException. Details: Can't create session when the connection is closing.. at Microsoft.Azure.Amqp.AmqpConnection.AddSession(AmqpSession session, Nullable`1 channel)
at Microsoft.Azure.Amqp.AmqpConnection.CreateSession(AmqpSessionSettings sessionSettings)
at Microsoft.Azure.EventHubs.Amqp.AmqpEventDataSender.CreateLinkAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.FaultTolerantAmqpObject`1.OnCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
at Microsoft.Azure.EventHubs.Amqp.AmqpEventDataSender.OnSendAsync(IEnumerable`1 eventDatas, String partitionKey)
at Microsoft.Azure.EventHubs.Amqp.AmqpEventDataSender.OnSendAsync(IEnumerable`1 eventDatas, String partitionKey)
at Microsoft.Azure.EventHubs.EventDataSender.SendAsync(IEnumerable`1 eventDatas, String partitionKey)
at Microsoft.Azure.EventHubs.EventHubClient.SendAsync(IEnumerable`1 eventDatas, String partitionKey)

The fix was to handle error cases better when the client is closed. If you are still getting this failure repeatedly with 4.3.0, then apparently the underlying network channel is failing. Two questions:
Hello @serkantkaraca
We only retry on EventHubsException, but this error is thrown as an InvalidOperationException, so the event will be lost. For us this is a critical issue because we don't want to lose any customer data. That's why I ask: should I also implement retry logic for this exception, or should I ignore it when it happens? https://github.com/Azure/azure-sdk-for-net/issues/15514#issuecomment-700409400
You won't lose any data as long as you retry. This should be a transient failure and should recover if retried.
Are you able to build a standalone repro, such as a console app, that you can share? If we can reproduce the failures in a controlled manner, it will be easier to pinpoint the root cause.
As I said, our API currently implements retry logic only for EventHubsException, not for InvalidOperationException.
Do you suggest that we also retry when InvalidOperationException occurs?
I could try to reproduce it with a console application, but that will take some time.
try
{
    await _queueSender.SendAsync(eventData, partitionKey);
}
catch (EventHubsException ex)
{
    if (ex.IsTransient) // currently we only implement retry for this
    {
        throw; // rethrow to the caller's retry logic, preserving the stack trace
    }
    else
    {
        _logger.LogError($"MessagingException occured but is not transient.{ex.Message}");
        return;
    }
}
catch (Exception ex)
{
    if (ex is TimeoutException || ex is UnauthorizedAccessException)
    {
        throw; // rethrow, preserving the stack trace
    }
    else
    {
        // InvalidOperationException ends up here: it is logged and the event is dropped
        var trace = string.IsNullOrEmpty(ex.StackTrace) ? "No stack trace" : ex.StackTrace;
        _logger.LogError($"Failed to send event due to {ex.GetType()}. Details: {ex.Message}. {trace}");
        return;
    }
}
@serkantkaraca here is our code path that enqueues events to the event hub.
From my perspective, it would be better if you threw EventHubsException rather than InvalidOperationException in this case. Please let me know if I have misunderstood anything.
I will find out if that is possible.
Great, thanks. I will implement retry for that specific error message as a workaround for now.
Please notify me if there is any new information.
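A workaround along those lines could look roughly like the sketch below. It is only an illustration: `TransientRetry`, `RunAsync`, `maxAttempts`, and `baseDelay` are hypothetical names, not part of the Event Hubs SDK, and matching on the message text is an assumption based on the stack traces in this thread, so the filter would need updating if the text changes.

```csharp
using System;
using System.Threading.Tasks;

public static class TransientRetry
{
    // Assumption: message text copied from the stack traces in this thread.
    private const string ClosingConnectionMessage =
        "Can't create session when the connection is closing";

    // True only for the client-side InvalidOperationException discussed here.
    public static bool IsClosingConnectionError(Exception ex) =>
        ex is InvalidOperationException &&
        ex.Message.IndexOf(ClosingConnectionMessage, StringComparison.Ordinal) >= 0;

    // Runs the operation and retries it, up to maxAttempts in total, when it
    // fails with the closing-connection error. Other exceptions are not caught.
    public static async Task RunAsync(
        Func<Task> operation, int maxAttempts, TimeSpan baseDelay)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                await operation().ConfigureAwait(false);
                return;
            }
            catch (Exception ex)
                when (attempt < maxAttempts && IsClosingConnectionError(ex))
            {
                // Linear backoff: baseDelay, 2 * baseDelay, ...
                await Task.Delay(TimeSpan.FromTicks(baseDelay.Ticks * attempt))
                    .ConfigureAwait(false);
            }
        }
    }
}
```

With the send call from the snippet above, usage would be something like `await TransientRetry.RunAsync(() => _queueSender.SendAsync(eventData, partitionKey), 3, TimeSpan.FromMilliseconds(200));`. On the last attempt the exception filter no longer matches, so the original exception reaches the caller with its stack trace intact.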
@bebeo92 Any updates on the results? How did the change work?
@serkantkaraca after adding the retry, everything works fine.
But again, this is not correct based on the MS documentation:
https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-messaging-exceptions#exception-types

You are right, the client's behavior doesn't match the MS documentation, because in this case the InvalidOperationException is generated on the client side. InvalidOperationExceptions generated on the service side are not retryable. Unfortunately, the exception alone doesn't tell you where it was generated, which makes this a challenging issue. I still want to fix the API experience, so I will send a PR to convert "cannot create a session" into a retryable error.
Do you see the errors recovering after a retry?
Yes, I can see the error recover after retry
I have sent a PR to convert this exception to a retryable error.
Great, thanks.
Hi, I am having this issue as well. We are using Microsoft.Azure.EventHubs.Processor version 4.3.0, but I am only subscribing to events.
Which version will that PR (https://github.com/Azure/azure-sdk-for-net/pull/15984#issue-503663997) be available in? Would it affect my scenario?
The fix will ship in the 4.3.1 release soon.