_From @daddyman on July 19, 2018 21:47_
Since moving to .NET Core 2.1 we have been seeing our aspnet core web api spike to 99% CPU usage on Linux. This does not happen when running on Windows.
What we observe in our application is if we make 4 or 5 API calls in quick succession then the CPU goes to 99% and stays there for a few minutes.
I was able to strip down our app and reproduce the issue in a simple app. The repo is here: https://github.com/daddyman/dotnetcorecpu
I am running .NET Core 2.1.2 using SDK 2.1.302 on Ubuntu 16.04.
It seems to have something to do with "await". In the reproduction case I have one controller that handles a POST.
await Task.CompletedTask; does not cause a problem.away DummyWait(); where DummyWait is return Task.Delay(1); causes high CPU after hitting the server with many requests in a short time. The requests complete quickly but then the CPU goes to 99% and stays there.In our real app I did the same change of using the await Task.CompletedTask; and 'await Task.DummyWait();` and saw the same difference with the CPU usage.
_Copied from original issue: dotnet/corefx#31208_
_From @davidsh on July 19, 2018 21:53_
cc: @Tratcher
_From @joshfree on July 19, 2018 21:57_
@brianrob could you PTAL and help route?
I updated the test app. Using a Task.Delay(100); makes the CPU spike happen quicker. I am guessing that causes the await to actually need to "wait" more often than the Task.Delay(1);.
In our app this is where we access the database using an await call which is going to need "wait" almost every time. Changing that code to do await Task.CompletedTask; never causes the CPU spike.
I have a (more complicated) application that is seeing similar 100% CPU spikes that occur on awaited calls. I never see the problem when running directly on Windows. I see it only when running in a docker container (which is built on top of the microsoft/dotnet:2.1-aspnetcore-runtime container).
I see the problem with SDK 2.1.301 and runtime 2.1.1.
When I downgrade to SDK 2.1.4 and runtime 2.0.5, the problem goes away.
@davidfowl - does ASP.NET Core do anything at all that might affect this?
FYI, I'm also looking at this, but haven't been able to repro it thus far.
Thanks @brianrob , please do post anything you find out, even if it's nothing 馃槃
A follow-up in response to @daddyman's comment:
In our app this is where we access the database using an await call which is going to need "wait" almost every time.
This is the case with me too. The await that triggers the spike in my code is on a call to the StackExchange.Redis SortedSetIncrementAsync (ZINCRBY) call. If I replace that with a synchronous call, the spike inevitably starts on a different await call.
Another piece of information: Once the 100% CPU spike hits, the await call that triggered it blocks for exactly 2 minutes so long as my application performs no additional awaited calls (via other controllers). Once the 2 minutes elapse, the CPU drops back to near-zero and the awaited call returns. The 2 minute-long spike is consistent; it's never more or less than 2 minutes unless I issue another awaited call from a different controller.
edit: I should add that the blocking await call that triggers the CPU spike has in fact successfully updated the database -- almost immediately -- with the transaction it was attempting to complete. The awaited call should have returned right away;; it's as if the CPU spinning is tied to the call's inability to return. At first I thought this might be a problem with the StackExchange.Redis library. But since @daddyman is also seeing something similar without using that library, it makes me think the problem might be .NET core 2.1.
@brianrob When using the test app I posted and the await is set to use Task.Delay(100); I see it happen when I have Insomnia up and spam the server by pressing Send very quickly multiple times. Sometimes I see if happen after just 3 or 4 sends and sometimes I have a do it over and over for 20 or 30 seconds.
I tried to use curl to reproduce but it never triggered the 99% CPU.
Thanks @daddyman, I can now reproduce this. I've just collected a trace and am looking at it now.
I've had a chance to repro this and try to understand the behavior a bit. What I see are after "enough" POST requests, Pairs of threads will saturate CPUs. Most commonly, I saw just a single pair (2) threads saturate two CPUs, but I did once encounter two pairs (4) threads saturate four CPUs.
Looking at the trace, the functions where the bulk of the CPU time are spent are:

Drilling into Pipe::ReadAsync shows that there is contention on this stack as well as a bunch of CPU time:

If I look at the time spent in managed code Monitors, it is coming from two methods (ReadAsync and AdvanceReader):

Note that this only seems to occur when I attempt to create contention by sending multiple requests at the same time, but the spike lasts for around a minute on my machine. The time spent in the spike seems to be attempting to process one or more requests and is slowly chipping away at them (with some contention).
It looks like this investigation should be continued by folks who have some familiarity with Pipelines.
@pakrym
@halter73
@Tratcher, yes, we have similar problems
@Tratcher Doing UseLibuv() in our web host builder chain also fixed my problem.
As @beevik mentioned switching to using libuv as the transport fixed the problem for us.
I got information on switching back to libuv from here: https://blogs.msdn.microsoft.com/webdev/2018/04/12/asp-net-core-2-1-0-preview2-improvements-to-the-kestrel-http-server/
I'm closing this in favour of https://github.com/aspnet/KestrelHttpServer/issues/2694