Some methods advertised as being async are in fact not.
Our application is using MSI authentication, and uses a db connection interceptor to set the access token on connection open. when we first enabled MSI authentication or application fell over at peak due to SNAT port exhaustion.
We eventually found the cause was due to workers being blocked on opening a db connection in a synchronous call even though the expectation was it would be async. The main culprit for our case was UserManager.FindByEmailAsync
Methods used by our application that exhibit this behaviour include
If you set RequireUniqueEmail to off then some if not all of the above do not use the blocking call.
https://github.com/sbkaspersen/jubilant-sniffle
dotnet --info.NET Core SDK (reflecting any global.json):
Version: 3.1.302
Commit: 41faccf259
Runtime Environment:
OS Name: Windows
OS Version: 10.0.18362
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\3.1.302\
Host (useful for support):
Version: 3.1.6
Commit: 3acd9b0cd1
.NET Core SDKs installed:
3.1.201 [C:\Program Files\dotnet\sdk]
3.1.302 [C:\Program Files\dotnet\sdk]
.NET Core runtimes installed:
Microsoft.AspNetCore.All 2.1.20 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.20 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 3.1.3 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 3.1.6 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 2.1.20 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 3.1.3 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 3.1.6 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.WindowsDesktop.App 3.1.3 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 3.1.6 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
This looks like an entity framework issue as that's where you're currently plugging in the MSI auth logic.
@ajcvickers wheres the appropriate way to plug in MSI?
cc @blowdart as you were looking at MSI
@sbkaspersen Can you share which version of the MSI libraries you are using, and how you're wiring them up?
@blowdart
We are running the webapp in azure. Using Microsoft.Azure.Services.AppAuthentication 1.4.0. AzureServiceTokenProvider is injected into the DB Conneciton Interceptor (commented out code in the repro).
We have worked around the issue by bypassing calls to the mentioned methods (Instead using the db context directly)
While MSI is symptomatic of the bug, the actual bug I believe is the Task.FromResult here. At least for FindByEmail., I suspect the other ones we found are related as they are only reproducible if
RequireUniqueEmail is true.
Regardless of it being root cause, it is an error in identity. https://github.com/dotnet/aspnetcore/commit/9ebabc133795a573d3012f8f0e2746c104a9e96f
This commit changed it to sync version when EF Core query pipeline was being rewritten.
But when removing work-around it did not revert back to async version https://github.com/dotnet/aspnetcore/commit/839cf8925278018903f53f22d580d15b0a59ca0f
I see so this line of code https://github.com/sbkaspersen/jubilant-sniffle/blob/4e31cf5b9c899da525f37bc600cbb25f0a0821ad/UserManagerAsyncTest.cs#L19 is irrelevant?
@davidfowl, That was the line of code causing all our worker threads to block.
Since we were using async calls our expectation was that https://github.com/sbkaspersen/jubilant-sniffle/blob/4e31cf5b9c899da525f37bc600cbb25f0a0821ad/UserManagerAsyncTest.cs#L32 would be entered and not L19 as is the current case.
When we use our own async dbcontext calls instead of UserManager we are only entering ConnectionOpeningAsync which does not block the worker threads.
Got it, so this does look like an identity issue.
Oops yeah, missing an Async, wonder how far back this goes
All the way back to 2.1 it looks like, how far back should we patch @ajcvickers @blowdart ?
Everything supported, so that means 2.1 as it's LTS
So release/2.1, release 3/1., 5.0-rc1. I'll start the PR with master, and I guess tag this servicing
Why would we patch this given the BCL is still pushing the design that new async APIs can defer to their sync counterparts because "it doesn't really matter" if people are getting sync when they think they are getting async? We should be consistent in how we view these things. Right now in the BCL async APIs just mean, "maybe async, maybe not" depending on the implementation.
Note, we still block in the other non query apis since those need to save to the database, Find is the only one that's blocking needlessly.
@ajcvickers Can you clarify what you mean by BCL? (Im new around her)
Base Class Libraries.
Will this bugfix be available in 3.1.7? I checked this milestone here and didn't find this PR included: https://github.com/dotnet/aspnetcore/milestone/112
Maybe that's not the right place to check the progress of upcoming releases, but I just wanted to emphasize that including it in the next release would be highly appreciated. Thank you
Most helpful comment
Will this bugfix be available in 3.1.7? I checked this milestone here and didn't find this PR included: https://github.com/dotnet/aspnetcore/milestone/112
Maybe that's not the right place to check the progress of upcoming releases, but I just wanted to emphasize that including it in the next release would be highly appreciated. Thank you