Azure-sdk-for-net: [BUG] Microsoft.Azure.ServiceBus - System.InvalidOperationException: Can't create session when the connection is closing

Created on 22 Jul 2020 · 15 comments · Source: Azure/azure-sdk-for-net

This issue is related to #9416 however I was asked to open a fresh thread.

Describe the bug

For quite some time, our Azure Functions instances running in AKS have intermittently been receiving the exceptions below, which come through into Sentry.

We have a pod running .NET Core 2.2.8 with Functions v2 in our production Kubernetes cluster and, after a recent upgrade, a different pod running .NET Core 3.1.5 with Functions v3 in our sandbox cluster; the exceptions are still being received from both pods intermittently. It seems to happen at random times, often days apart. I hoped upgrading to Functions v3 might resolve the issue, but it persists.

The production Functions pod references Microsoft.Azure.ServiceBus v3.4.0 and the sandbox Functions pod references Microsoft.Azure.ServiceBus v4.1.3.

The exception also seems to occur regardless of whether the function definition is for a Queue or Topic trigger.

Actual behavior (include Exception or Stack Trace)

Exception message:
Message processing error (Action=Receive, ClientId=MessageReceiver12account-events/Subscriptions/new-account-setup, EntityPath=account-events/Subscriptions/new-account-setup, Endpoint=sndbx-sb-project-au.servicebus.windows.net)

Note: It happens with lots of different service bus queues/topics, the exception message often relates to a different queue/topic each time.

Stack Trace:
System.InvalidOperationException: Can't create session when the connection is closing.
Module "Microsoft.Azure.ServiceBus.Core.MessageReceiver", in OnReceiveAsync
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "System.Runtime.CompilerServices.TaskAwaiter", in HandleNonSuccessAndDebuggerNotification
Module "Microsoft.Azure.ServiceBus.Core.MessageReceiver+<>c__DisplayClass64_0+<<ReceiveAsync>b__0>d", in MoveNext
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "Microsoft.Azure.ServiceBus.RetryPolicy", in RunOperation
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "Microsoft.Azure.ServiceBus.RetryPolicy", in RunOperation
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "System.Runtime.CompilerServices.TaskAwaiter", in HandleNonSuccessAndDebuggerNotification
Module "Microsoft.Azure.ServiceBus.Core.MessageReceiver", in ReceiveAsync
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "System.Runtime.CompilerServices.TaskAwaiter", in HandleNonSuccessAndDebuggerNotification
Module "Microsoft.Azure.ServiceBus.Core.MessageReceiver", in ReceiveAsync
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "System.Runtime.CompilerServices.TaskAwaiter", in HandleNonSuccessAndDebuggerNotification
Module "Microsoft.Azure.ServiceBus.MessageReceivePump+<b__11_0>d", in MoveNext

Another interesting piece of info is that I am also receiving this exception at essentially the same time:

System.ObjectDisposedException: Cannot access a disposed object.
Object name: '$cbs'.
Module "Microsoft.Azure.ServiceBus.Core.MessageReceiver", in OnReceiveAsync
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "System.Runtime.CompilerServices.TaskAwaiter", in HandleNonSuccessAndDebuggerNotification
Module "Microsoft.Azure.ServiceBus.Core.MessageReceiver+<>c__DisplayClass64_0+<<ReceiveAsync>b__0>d", in MoveNext
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "Microsoft.Azure.ServiceBus.RetryPolicy", in RunOperation
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "Microsoft.Azure.ServiceBus.RetryPolicy", in RunOperation
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "System.Runtime.CompilerServices.TaskAwaiter", in HandleNonSuccessAndDebuggerNotification
Module "Microsoft.Azure.ServiceBus.Core.MessageReceiver", in ReceiveAsync
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "System.Runtime.CompilerServices.TaskAwaiter", in HandleNonSuccessAndDebuggerNotification
Module "Microsoft.Azure.ServiceBus.Core.MessageReceiver", in ReceiveAsync
Module "System.Runtime.ExceptionServices.ExceptionDispatchInfo", in Throw
Module "System.Runtime.CompilerServices.TaskAwaiter", in ThrowForNonSuccess
Module "System.Runtime.CompilerServices.TaskAwaiter", in HandleNonSuccessAndDebuggerNotification
Module "Microsoft.Azure.ServiceBus.MessageReceivePump+<b__11_0>d", in MoveNext

To Reproduce
Not too sure, since it only happens intermittently once the Functions project is deployed. I have never encountered this exception when debugging locally.

An example of one of the Topic trigger function definitions is:

[FunctionName(nameof(AccountSetupTrigger))]
public async Task Run([ServiceBusTrigger("account-events", "new-account-setup", Connection = "ServiceBus")] AccountSetupMessage message, ILogger logger)

This is the csproj file (for the Sandbox Functions V3):

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netcoreapp3.1</TargetFramework>
    <AzureFunctionsVersion>v3</AzureFunctionsVersion>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.Azure.Functions.Extensions" Version="1.0.0" />
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.ServiceBus" Version="4.1.2" />
    <PackageReference Include="Microsoft.NET.Sdk.Functions" Version="3.0.7" />
    <PackageReference Include="Sentry.AspNetCore" Version="2.1.4" />
    <PackageReference Include="Stripe.net" Version="34.16.0" />
    <PackageReference Include="Xero.Api.SDK.Core" Version="1.1.4" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\Common\Common.csproj" />
    <ProjectReference Include="..\Data\Data.csproj" />
  </ItemGroup>

  <ItemGroup>
    <None Update="host.json">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
    <None Update="appsettings.json">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
    <None Update="local.settings.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <CopyToPublishDirectory>Never</CopyToPublishDirectory>
    </None>
  </ItemGroup>
</Project>

And the host.json file:

{
  "version": "2.0",
  "logging": {
    "logLevel": {
      "default": "Information"
    }
  },
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 1
      }
    }
  },
  "functions": [
    "AccountSetupTrigger"
    // Lots of other triggers here too...
  ]
}

Environment:

  • Microsoft.Azure.ServiceBus 4.1.3 and Microsoft.Azure.ServiceBus 3.4.0
  • Azure Functions V2 and V3
  • .NET Core 2.2.8 and .NET Core 3.1.5
  • AKS v1.14.8

Let me know if you require any more information and thanks in advance for your assistance.

Labels: Client, Service Attention, Service Bus, customer-reported, needs-team-attention, question

All 15 comments

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jfggdl.

@mr-davidc Thanks for reaching out. You mentioned you have two environments, using Microsoft.Azure.ServiceBus v3.4.0 and Microsoft.Azure.ServiceBus v4.1.3 respectively. We shipped a fix in Microsoft.Azure.ServiceBus v4.1.0 that converts this System.InvalidOperationException into a ServiceBusCommunicationException, so v4.1.3 should not see this error, but v3.4.0 likely will, since it doesn't have the fix. Are you sure the stack trace you shared is from Microsoft.Azure.ServiceBus v4.1.3?

Would it be convenient to use Microsoft.Azure.ServiceBus v4.1.3 in both environments and test it out?
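For callers using the SDK directly, the practical effect of the fix mentioned above is the exception type: before v4.1.0 the raw InvalidOperationException escapes, while v4.1.0+ surfaces it as ServiceBusCommunicationException, which the SDK's retry policy treats as transient. A minimal sketch of a direct receive with that handling (the connection string, entity path, and timeout are placeholders, not values from this issue):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.ServiceBus.Core;

class ReceiveSketch
{
    static async Task Main()
    {
        // Placeholder connection string and entity path.
        var receiver = new MessageReceiver(
            "Endpoint=sb://example.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=...",
            entityPath: "account-events/Subscriptions/new-account-setup");

        try
        {
            var message = await receiver.ReceiveAsync(TimeSpan.FromSeconds(30));
            if (message != null)
                await receiver.CompleteAsync(message.SystemProperties.LockToken);
        }
        catch (ServiceBusCommunicationException ex)
        {
            // v4.1.0+: connection-closing races surface as a communication
            // exception, which is safe to treat as transient and retry.
            Console.WriteLine($"Transient: {ex.Message}");
        }
        finally
        {
            await receiver.CloseAsync();
        }
    }
}
```

This requires a live Service Bus namespace to actually run; it is only meant to show which exception type the fixed versions surface.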

Hi @DorothySun216, I just double checked and yes I can confirm the exceptions I am currently seeing are being thrown from within v4.1.3 of Microsoft.Azure.ServiceBus in our Sandbox environment.

The most recent set of exceptions recorded by Sentry was 4 days ago on 30-07. I have attached a screenshot of the Sentry page showing the version of the ServiceBus package and a screenshot of the number of exceptions received on that particular day. I should note that each of the exceptions in the list of exceptions refers to a different Service Bus queue/topic.

In terms of using v4.1.3 in both environments, unfortunately it is not possible as of yet as we are still waiting for some other development work to be completed before it can be rolled to our Production environment.

I am happy to provide any other information which might help track the issue down.

Thanks

[Screenshots attached: Sentry showing the ServiceBus package version, and the exception count for that day]

@DorothySun216

Just 13 hours ago, the functions project experienced more of these exceptions. Interestingly, another completely different functions project (running the same Functions and ServiceBus DLL versions), deployed in the same cluster but in a separate Kubernetes pod, ALSO experienced the same exceptions at that time...

Any ideas?

@mr-davidc thanks for confirming on the version. Can you share with us a snippet of your code when you run into this error and we will see if we can repro it? Are you using ReceiveAsync or RegisterMessageHandler? We could try to translate this exception into communication exception if we can repro so the retry logic of SDK will auto retry if it's communication exception.

That's the thing, @DorothySun216: I'm not actually using ReceiveAsync or RegisterMessageHandler explicitly anywhere in my code. The only place I reference anything Service Bus related is in the function definition, via the ServiceBusTrigger attribute, like so:

Queue trigger example:
public async Task Run([ServiceBusTrigger("update-account-balances", Connection = "ServiceBus")] OrganisationMessage message, ILogger logger)

Topic trigger example:
public async Task Run([ServiceBusTrigger("account-events", "new-account-setup", Connection = "ServiceBus")] AccountSetupMessage message, ILogger logger)

The exceptions appear to be coming from the Functions runtime itself, from the Host.Executor?
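For context on why MessageReceivePump and ReceiveAsync appear in the traces even though user code never calls them: the Service Bus trigger binding drives a receive pump on the user's behalf, which is conceptually similar to registering a message handler directly with the SDK. A rough, hedged sketch of the equivalent direct-SDK usage (the connection string and entity names are placeholders; this is not the binding's actual implementation):

```csharp
using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

class PumpSketch
{
    static void Main()
    {
        // Placeholder connection string, topic, and subscription.
        var client = new SubscriptionClient(
            "Endpoint=sb://example.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=...",
            topicPath: "account-events",
            subscriptionName: "new-account-setup");

        // This is roughly what the trigger binding does for you: it starts an
        // internal MessageReceivePump that loops on ReceiveAsync.
        client.RegisterMessageHandler(
            async (message, cancellationToken) =>
            {
                Console.WriteLine(Encoding.UTF8.GetString(message.Body));
                await client.CompleteAsync(message.SystemProperties.LockToken);
            },
            new MessageHandlerOptions(args =>
            {
                // Receive-loop errors (like the ones in this issue) surface
                // here rather than in the message handler itself.
                Console.WriteLine(args.Exception);
                return Task.CompletedTask;
            })
            { MaxConcurrentCalls = 1, AutoComplete = false });

        Console.ReadLine();
    }
}
```

Because the pump's receive loop runs inside the runtime, exceptions thrown from ReceiveAsync are reported by the host's executor rather than by the user's function code.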

@mr-davidc Thanks for the info. I will reach out to the Azure Functions team to see how they are calling our API internally since we don't have access to that code. Are you blocked on this issue?

@DorothySun216 No, not blocked as such but ideally I would like to get to a point where these exceptions no longer occur as they are essentially false positives at the moment.

Looking forward to seeing what you come back with after discussing with the Azure Functions team.

Thanks for your help.

We are also experiencing the same exceptions (System.InvalidOperationException and System.ObjectDisposedException) in our Kubernetes cluster in AKS running .net Core 3.1. (Stack traces below)

It started occurring in May this year and happens intermittently. We are using Microsoft.Azure.ServiceBus 4.1.1.

Stack trace (flattened from the JSON; assemblies are Microsoft.Azure.ServiceBus 4.1.1.0 and System.Private.CoreLib 4.0.0.0):

Microsoft.Azure.ServiceBus.Core.MessageReceiver+<OnReceiveAsync>d__86.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.Azure.ServiceBus.Core.MessageReceiver+<>c__DisplayClass64_0+<<ReceiveAsync>b__0>d.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
Microsoft.Azure.ServiceBus.RetryPolicy+<RunOperation>d__19.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
Microsoft.Azure.ServiceBus.RetryPolicy+<RunOperation>d__19.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.Azure.ServiceBus.Core.MessageReceiver+<ReceiveAsync>d__64.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.Azure.ServiceBus.Core.MessageReceiver+<ReceiveAsync>d__62.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification

and
Stack trace (flattened from the JSON; assemblies are Microsoft.Azure.ServiceBus 4.1.1.0 and System.Private.CoreLib 4.0.0.0; the frames are identical to the first trace):

Microsoft.Azure.ServiceBus.Core.MessageReceiver+<OnReceiveAsync>d__86.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.Azure.ServiceBus.Core.MessageReceiver+<>c__DisplayClass64_0+<<ReceiveAsync>b__0>d.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
Microsoft.Azure.ServiceBus.RetryPolicy+<RunOperation>d__19.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
Microsoft.Azure.ServiceBus.RetryPolicy+<RunOperation>d__19.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.Azure.ServiceBus.Core.MessageReceiver+<ReceiveAsync>d__64.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.Azure.ServiceBus.Core.MessageReceiver+<ReceiveAsync>d__62.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification

@tharidu thanks so much for reporting this. We have already created a work item to track it. Due to our bandwidth, and since this is not a blocking error, the investigation may take some time, as we need to focus on high-priority issues for now. I would recommend treating this as a transient error for the time being and adding retry mechanisms to deal with it. I will update this thread as soon as we figure out a fix. Thanks.
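The suggestion above (treat the error as transient and retry) can be sketched as a small generic wrapper. This helper is hypothetical, not part of the SDK; the exception filter, attempt count, and back-off delay are assumptions to tune for your workload:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: retries an operation that fails with the
// "Can't create session when the connection is closing" style of
// InvalidOperationException, treating it as transient.
public static class TransientRetry
{
    public static async Task<T> RunAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts = 3,
        TimeSpan? delay = null)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (InvalidOperationException) when (attempt < maxAttempts)
            {
                // Back off briefly before retrying; on the final attempt
                // the exception propagates to the caller.
                await Task.Delay(delay ?? TimeSpan.FromMilliseconds(100));
            }
        }
    }
}
```

A caller would wrap the flaky receive or message-handling call, e.g. `await TransientRetry.RunAsync(() => DoReceiveAsync())`, so only the final failure after all attempts surfaces to monitoring.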

@tharidu Do you mind if I ask what version of Kubernetes your AKS cluster is running?

@DorothySun216 Thanks for the reply (y)
@mr-davidc We are using K8s version 1.15.10 at the moment in two clusters and experiencing these exceptions in both.

Ok thanks @tharidu. I was wondering if upgrading our cluster version might help in resolving the issue but I guess not.

Hello, we have also experienced the same issue and opened an incident with Microsoft. While we are not blocked, it does cause rework due to retries and failures to dead-letter. This affects several of us across many integrations.

We have rolled out a fix (https://github.com/Azure/azure-sdk-for-net/pull/17023) in the latest release, 5.1.0. Can you test with this new NuGet package and let us know whether you are still seeing the same issue? https://www.nuget.org/packages/Microsoft.Azure.ServiceBus/5.1.0
