I'm having a weird issue with the new "Sockets" transport and I hope you might have some ideas!
I'm running the latest .NET Core 2.1.1 / ASP.NET Core 2.1.1. I have a single-node Service Fabric cluster for development purposes on Azure (Windows, 2 cores, 16GB RAM) and it runs ~ 60 .NET Core processes (self-contained, win10-x64). About half of them are ASP.NET Core applications, the rest is regular console applications with background services. The applications are communicating via the Service Fabric HTTP reverse proxy on localhost (using HttpClient).
As it's a dev/demo system, there's almost no load on it. However, when using the new "Sockets" transport, I get high CPU usage (50% - 100%) on the ASP.NET Core apps for quite a few seconds every few minutes, which results in a very unstable system.
As you can see in the overall CPU usage of the machine, the system became much more stable after I switched back to Libuv. I afterwards tried Sockets() with one app and CPU usage became unstable again.

I've been running PerfView and it shows that the high CPU usage happens around these calls:
module Microsoft.AspNetCore.Server.Kestrel.Core <<Microsoft.AspNetCore.Server.Kestrel.Core!Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol+<ProcessRequests>d__186`1[Microsoft.AspNetCore.Hosting.Internal.HostingApplication+Context].MoveNext()>>
module system.io.pipelines <<system.io.pipelines!System.IO.Pipelines.Pipe+DefaultPipeReader.ReadAsync(value class System.Threading.CancellationToken)>>
module system.io.pipelines <<system.io.pipelines!System.IO.Pipelines.Pipe+DefaultPipeReader.AdvanceTo(value class System.SequencePosition,value class System.SequencePosition)>>
Here are some screenshots. Calls by name:

Call tree:



_Process CPU usage with libuv:_
As I've said there's almost no load on the app/system.

_Process CPU usage with Sockets transport:_
Some spikes just take 1-2 seconds, others take a lot longer.

Any ideas on a possible cause?
PS: I don't know much about PerfView so I hope that what I've done is not misleading.
Pinging @pakrym because of the high exclusive sample counts from DefaultPipeReader.ReadAsync and DefaultPipeReader.AdvanceTo inside ProcessRequests.
My guess is that the client is sending request data slowly enough that Kestrel tries to repeatedly reparse small segments of the request start line and headers, and for some reason, the libuv transport does a better job of batching data in order to reduce calls to the socket receive callback and thereby also the http parser.
@cwe1ss You could verify this guess by collecting a tracing profile instead of a sampling-based one and comparing the invocation count of DefaultPipeReader.ReadAsync with the libuv vs. the socket transport under a similar load where the socket transport performs worse. I don't know if PerfView can collect method invocation counts, but this is easy with dotTrace.
Alternatively, if you could give us sample server and client apps that repro these Socket transport CPU usage spikes, that would be even better.
@halter73 thank you very much for your response!
I did find a few exceptions in our logs regarding slow reading of the request body. However, there have only been ~ 15 exceptions in total in the last 7 days, so much less than the number of CPU spikes we've had (every few minutes per app). So it's hard to say if those timeouts were a potential cause or just a sign of high CPU load during these spikes.
Microsoft.AspNetCore.Server.Kestrel.Core.BadHttpRequestException
Reading the request body timed out due to data arriving too slowly. See MinRequestBodyDataRate.
Also, I did not get any of these exceptions since I switched back to Libuv. Everything runs quite smoothly since the switch:

Note that I currently also have another issue with Application Insights that results in slowly increasing CPU usage / a memory leak - hence the small rise in response duration at the end. (Using the latest and greatest can be pretty tough sometimes...) However, I've disabled Application Insights for one of my apps and I've had the same CPU spikes with Sockets so I don't think they're related.
I guess it will be really hard to reproduce this in a sample app. 😞 I do have a dotTrace license so I could do some profiling with it, however, since I'm using Service Fabric I can't easily launch my apps with dotTrace on the server. I quickly tried remote profiling but it only showed full .NET applications. My experience with dotTrace is very limited unfortunately. Do you know of an easy way to do this with Service Fabric / .NET Core?
Your help is much appreciated.
Having the same issue on Ubuntu 16.04. The local Windows test passed without any problems.
After many concurrent requests, 1 or 2 of our 8 cores stay fully utilized even after the requests have been processed. The CPU usage only drops once we see this in the logs:
_Connection id "0HLF6G2D3H2RB" sending FIN.
dbug: Microsoft.AspNetCore.Server.Kestrel[2]
Connection id "0HLF6G2D3H2RB" stopped._
All the tracing data is related to reading from Pipes and AsyncTaskBuilder.
At 100 rps we see:

I've also run into this issue.
For our application, the requests should all be fairly quick because they're all within a single AWS region, not over the internet.
The application only handles a couple different request types, and only handles probably around 5 requests per second.
It's running in a Debian 9 container in ECS.
This CPU graph is averaged between 3 containers.
It's going above 100% CPU because it's allowed to burst over the CPU reservation we have assigned to it.
I tried adjusting the MinRequestBodyDataRate to 1 byte per second, with a 10 second grace period and it didn't have any noticeable change. Switching to libuv does appear to fix the problem.
Here's a flamegraph sampled over 45 seconds.

I also have this problem. CPU usage is stable after switching back to Transport.Libuv.

This image shows the CPU usage (%) with the 2.1.1 default transport and with libuv.
@pakrym were you able to reproduce it?
Also having this problem in Azure App Services. Is the workaround for now to call .UseLibuv() from Program.Main?
Calling .UseLibuv() is a workaround for this issue.
I have attempted to bring in the .UseLibuv() workaround to fix problems we might be able to attribute to this. Upon bringing the nuget package in and applying the workaround, deploying to an azure web app, and letting it go we see almost full 100% cpu utilization on the app service plan. The same utilization is not seen running it locally.
.NET Core 2.1.2
all latest Microsoft.AspNetCore packages
all latest Microsoft.ApplicationInsights packages
For AppInsights, are you using version 2.4.0? https://github.com/Microsoft/ApplicationInsights-aspnetcore/issues/690#issuecomment-411968077
We have a fix for this in our next patch release (2.1.4, September.)
@muratg, @pakrym is there a commit or text description of the problem?
I am having this same issue with the Sockets transport in Debian9 with sdk 2.1.4, .net core 2.1.2
Fixed by switching to libuv.
There was a bug in System.IO.Pipelines that caused ReadAsync calls not to "block" in some cases resulting in a tight loop and CPU spike until more data was written by the client.
Thanks for looking into this @pakrym! Is this fixed with the most recent updates (ASP.NET 2.1.4, System.IO.Pipelines 4.5.1) or will this be included in a future release?
@cwe1ss yes, the fix is in 2.1.4/4.5.1.
Fixed in 2.1.4.
Whew! Stuck on this for two weeks, thanks for the update! Problems went away after upgrading, see https://stackoverflow.com/questions/52561063
This may sound dumb, but fixed in version 2.1.4 of what, exactly? The .net core runtime? sdk? some other library?
@MatthewLymer The fix is in the ASP.NET Core runtime 2.1.4. You can get it via installing the latest 2.1 SDK or install the .NET Core and ASP.NET Core runtimes directly.
I upgraded the runtime on my Ubuntu 16.04 boxes to dotnet core runtime 2.1.5.
Libuv seems to work fine on both 2.1.2 and 2.1.5, but Sockets seems to be bad on both (I don't have test results for 2.1.2, but they're just as bad).
Is there something else I need to do to have the performance issue resolved?
dotnet --info
Host (useful for support):
Version: 2.1.5
Commit: 290303f510
.NET Core SDKs installed:
No SDKs were found.
.NET Core runtimes installed:
Microsoft.NETCore.App 2.1.5 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
To install additional .NET Core runtimes or SDKs:
https://aka.ms/dotnet-download

EDIT: internal and external refer to different applications, though both using same version of .net core
@MatthewLymer Could you also install ASP.NET Core runtime there as well? Min version 2.1.4, but since 2.1.5 is also out, just install that. You want both runtimes to be the latest patch in general.
Sorry I have misguided you above. Amended the post with this.
@muratg I'll give that a try. I am at a bit of a loss to understand what the different runtimes are for: my (aspnetcore) app seems to behave properly with the regular dotnet core runtime. Is there any documentation on what the aspnetcore runtime does specifically, and why Kestrel would presumably behave differently between the two?
@MatthewLymer It's all about layering. ASP.NET Core runtime depends on .NET Core runtime, and it carries ASP.NET optimized binaries for its target platform.
If you install the SDK, it brings in all the runtimes. But some folks run .NET workloads without ASP.NET and they may prefer not to bring in ASP.NET at all.
@MatthewLymer If you decide you want to install just the runtime instead of the full SDK, you can find all the install links/instructions for ASP.NET Core here. And here are the Ubuntu 16.04 specific install instructions.
If you do this, the dotnet --info output will include the 2.1.5 Microsoft.AspNetCore.App runtime:
dotnet --info
A compatible SDK version for global.json version: [2.2.100-preview2-009404] from [/home/shalter/aspnet/KestrelHttpServer/global.json] was not found
Host (useful for support):
Version: 2.1.5
Commit: 290303f510
.NET Core SDKs installed:
No SDKs were found.
.NET Core runtimes installed:
Microsoft.AspNetCore.All 2.1.5 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.5 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.NETCore.App 2.1.5 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
To install additional .NET Core runtimes or SDKs:
https://aka.ms/dotnet-download
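A rough shell sketch of that check, in case it helps: it reads `dotnet --info` output on stdin and looks for a `Microsoft.AspNetCore.App` entry at 2.1.4 or later (the fix version mentioned in this thread). The function name `has_fixed_aspnetcore_runtime` is made up for this example and is not part of the dotnet CLI.

```shell
# Hypothetical helper (not part of the dotnet CLI): reads `dotnet --info`
# output on stdin and succeeds only if a Microsoft.AspNetCore.App runtime
# with version 2.1.4 or later is listed.
has_fixed_aspnetcore_runtime() {
  awk '
    $1 == "Microsoft.AspNetCore.App" {
      split($2, v, ".")
      major = v[1] + 0; minor = v[2] + 0; patch = v[3] + 0
      if (major > 2 || (major == 2 && minor > 1) ||
          (major == 2 && minor == 1 && patch >= 4)) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage:
#   dotnet --info | has_fixed_aspnetcore_runtime && echo "patched ASP.NET Core runtime found"
```

It only inspects the shared runtimes the host reports, so it says nothing about assemblies an app carries in its own publish folder.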
Also, if you run lsof -p <PID of your "dotnet exec" process> you should see that Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.dll is loaded from the /usr/share/dotnet/shared/Microsoft.AspNetCore.App/2.1.5/ folder. I would double check this is the case so you know you're using a version of the dll that contains the fix for this issue. Currently, I suspect you're loading the dll from a self-contained app.
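One way to script the lsof check, as a sketch only: the two function names are hypothetical, the shared-runtime folder layout is taken from the path quoted in this thread, and the path is assumed to contain no spaces.

```shell
# Hypothetical helper: reads `lsof -p <PID>` output on stdin and prints the
# path (last column) of every loaded Kestrel Sockets transport assembly.
# Assumes the path contains no spaces.
sockets_transport_paths() {
  awk '/Kestrel\.Transport\.Sockets\.dll$/ { print $NF }'
}

# Hypothetical helper: says whether such a path points into the shared
# ASP.NET Core runtime folder or somewhere else (e.g. the app's own folder).
sockets_transport_origin() {
  case "$1" in
    */shared/Microsoft.AspNetCore.App/*) echo "shared runtime" ;;
    *) echo "app-local copy" ;;
  esac
}

# Usage:
#   lsof -p <PID> | sockets_transport_paths
```

If the printed path is classified as "app-local copy", the process is not picking up the dll from the system-installed runtime, so patching that runtime alone would not change which version is loaded.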
@halter73 I suspect you're right about the self-contained app (I'll check tomorrow). It seems very confusing to me to have two ways of achieving the same thing (self-contained vs runtime). Does this exist simply for backwards compatibility's sake, and is the runtime the way to go in the future?
When I build my server images I just ended up installing the regular dotnet core runtime as my application just worked. There wasn't any documentation that indicated that installing a different runtime with the same version number would fix performance crippling bugs, so I went with the minimal installation necessary to get my application running.
In my csproj when I import via nuget the Microsoft.AspNetCore.App, does this not dictate that I will be executing the self contained binary? Or is there some mechanism behind the scenes that says it'll actually use another assembly all together?
@MatthewLymer Whether or not your app is self-contained usually depends on what parameters you pass into dotnet publish. If you use the -r flag (e.g. dotnet publish -c release -r ubuntu.16.04-x64), you'll see that your app along with the entire runtime is published to ./bin/release/netcoreapp2.1/ubuntu.16.04-x64/publish. For the default web template, the size of this directory totals about 96MB! The nice thing is that if you make sure ./bin/release/netcoreapp2.1/ubuntu.16.04-x64/publish/myapp has the executable bit set, you can just deploy it on any Ubuntu 16.04 x64 machine without needing to install the .NET runtime or the ASP.NET runtime.
If you omit the -r flag (e.g. dotnet publish -c release), you'll see just your app is published to ./bin/release/netcoreapp2.1/publish/. For the default web template, the size of this directory totals to about 250KB. You can use this directory to run your app on any machine with the compatible runtime installed no matter what OS or bitness. Also if you update your server with the latest patch, every app running on the server that depends on the system-installed runtime gets the update without redeploying.
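To tell the two kinds of publish output apart after the fact, a heuristic sketch along these lines may help. It assumes that a self-contained publish places the CoreCLR runtime library (`libcoreclr.so`, `coreclr.dll` or `libcoreclr.dylib`) next to the app and a framework-dependent one does not; `publish_kind` is a made-up name, and this is not an official check.

```shell
# Hypothetical helper: guesses how a publish directory was produced.
# Heuristic (an assumption, not an official check): a self-contained publish
# ships the CoreCLR runtime library next to the app, a framework-dependent
# publish does not.
publish_kind() {
  if [ -e "$1/libcoreclr.so" ] || [ -e "$1/coreclr.dll" ] ||
     [ -e "$1/libcoreclr.dylib" ]; then
    echo "self-contained"
  else
    echo "framework-dependent"
  fi
}

# Usage:
#   publish_kind ./bin/release/netcoreapp2.1/publish
#   publish_kind ./bin/release/netcoreapp2.1/ubuntu.16.04-x64/publish
```

A "self-contained" result means the app brings its own runtime, so updating the system-installed runtime would not affect it until it is republished.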
@halter73 I am definitely omitting the -r flag and I do get about ~50 Microsoft.AspNetCore.* assemblies there totalling ~4.5mb.
If I run this on a server with the AspNetCore runtime, are these assemblies not used (and newer, better ones used instead)? If so, is it because these newer binaries are implicitly loaded by the runtime when starting an application, or is there something else that ensures the non-packaged ones are used?
If I were to stay with the regular dotnet core runtime, would it be possible to update my project to include the fixed version of System.IO.Pipelines and get all this goodness in a more obvious (to me) manner?
@MatthewLymer Check this out: https://docs.microsoft.com/en-us/dotnet/core/deploying/ for more details. Thanks.
Thanks for the info, sorry for derailing the issue
@MatthewLymer we're all waiting for your new AWS graphs with aspnetcore-runtime and the Sockets transport
Woop woop! Migrating to aspnetcore 2.1.5 made things better than before!

related commit, fyi
https://github.com/dotnet/corefx/commit/995dea0d6cd1546e7d3c144dbe73454c1df8f3aa
@pakrym, thanks