Roslyn: Proposal: Inline async method

Created on 2 May 2020 · 16 Comments · Source: dotnet/roslyn

Currently, each async method creates its own state machine.
I want to combine the state machines of nested async method invocations.
It would work like F# inline methods, within a single assembly only.

Area-Compilers Feature Request Resolution-Duplicate

Source

RamType0

Most helpful comment

Duplicate of #15491

sharwell on 8 May 2020

👍2

All 16 comments

> I want to combine the state machines of nested async method invocations.

Why? If it's to improve performance, can you show how much this would help you in a realistic scenario?

svick on 2 May 2020

> > I want to combine the state machines of nested async method invocations.
>
> Why?

For performance.

> If it's to improve performance, can you show how much this would help you in a realistic scenario?

Here's a test project.

I got the following results.

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.778 (1909/November2018Update/19H2)
Intel Core i9-9900K CPU 3.60GHz (Coffee Lake), 1 CPU, 16 logical and 8 physical cores
  [Host]        : .NET Framework 4.8 (4.8.4150.0), X64 RyuJIT
  .NET 4.8      : .NET Framework 4.8 (4.8.4150.0), X64 RyuJIT
  .NET Core 3.1 : .NET Core 3.1.3 (CoreCLR 4.700.20.11803, CoreFX 4.700.20.12001), X64 RyuJIT
  Mono 6.8.0    : Mono 6.8.0 (Visual Studio), X64

| Method | Job | Runtime | Mean | Error | StdDev |
|---------------- |-------------- |-------------- |--------------:|-------------:|-------------:|
| NestedNoAsync | .NET 4.8 | .NET 4.8 | 1,072.12 ns | 1.803 ns | 1.599 ns |
| InlineNoAsync | .NET 4.8 | .NET 4.8 | 47.65 ns | 0.257 ns | 0.214 ns |
| NestedYieldOnce | .NET 4.8 | .NET 4.8 | 6,879.69 ns | 125.388 ns | 117.288 ns |
| InlineYieldOnce | .NET 4.8 | .NET 4.8 | 1,560.72 ns | 20.274 ns | 17.973 ns |
| NestedYieldEach | .NET 4.8 | .NET 4.8 | 41,376.44 ns | 358.283 ns | 335.138 ns |
| InlineYieldEach | .NET 4.8 | .NET 4.8 | 29,016.23 ns | 84.470 ns | 79.013 ns |
| NestedNoAsync | .NET Core 3.1 | .NET Core 3.1 | 564.90 ns | 0.638 ns | 0.597 ns |
| InlineNoAsync | .NET Core 3.1 | .NET Core 3.1 | 18.12 ns | 0.230 ns | 0.215 ns |
| NestedYieldOnce | .NET Core 3.1 | .NET Core 3.1 | 3,217.08 ns | 31.706 ns | 29.658 ns |
| InlineYieldOnce | .NET Core 3.1 | .NET Core 3.1 | 849.34 ns | 3.877 ns | 3.238 ns |
| NestedYieldEach | .NET Core 3.1 | .NET Core 3.1 | 28,945.69 ns | 577.294 ns | 809.285 ns |
| InlineYieldEach | .NET Core 3.1 | .NET Core 3.1 | 20,666.05 ns | 166.498 ns | 155.742 ns |
| NestedNoAsync | Mono 6.8.0 | Mono 6.8.0 | 891.03 ns | 2.358 ns | 2.206 ns |
| InlineNoAsync | Mono 6.8.0 | Mono 6.8.0 | 40.26 ns | 0.035 ns | 0.029 ns |
| NestedYieldOnce | Mono 6.8.0 | Mono 6.8.0 | 16,797.55 ns | 311.562 ns | 305.996 ns |
| InlineYieldOnce | Mono 6.8.0 | Mono 6.8.0 | 10,095.49 ns | 200.777 ns | 396.314 ns |
| NestedYieldEach | Mono 6.8.0 | Mono 6.8.0 | 178,924.15 ns | 538.445 ns | 503.661 ns |
| InlineYieldEach | Mono 6.8.0 | Mono 6.8.0 | 157,137.39 ns | 3,116.256 ns | 8,154.698 ns |

XXXNoAsync
The method is marked as async, but it completes synchronously.
In this pattern, no state machines are allocated on the heap (thanks to ValueTask).
XXXYieldOnce
Calls "Task.Yield" and awaits it once. The most blatant, but still realistic, scenario (because async is "infectious").
- InlineYieldOnce creates one state machine on the heap, and a continuation is invoked through a delegate once.
- NestedYieldOnce creates 21 state machines on the heap, and continuations are invoked through delegates 21 times.
XXXYieldEach
Calls "Task.Yield" twenty times and awaits it each time.

Each NestedXXX and InlineXXX pair does exactly the same work.
But in every scenario, InlineXXX was definitely faster.
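
For readers without access to the test project, the two YieldOnce shapes might look roughly like the sketch below. This is not the code from the linked project: the class name `YieldOnceSketch`, the method names, and the `depth` parameter are all made up for illustration, and the "work" is reduced to counting call levels.

```C#
using System.Threading.Tasks;

// Hypothetical sketch; names and structure are assumptions, not the original benchmark.
public static class YieldOnceSketch
{
    // Nested shape: every level is its own async method. When the innermost
    // level suspends, each of the depth + 1 invocations has its own state machine.
    public static async ValueTask<int> NestedAsync(int depth)
    {
        if (depth == 0)
        {
            await Task.Yield(); // the single real suspension point
            return 0;
        }
        return (await NestedAsync(depth - 1)) + 1;
    }

    // Inline shape: the same counting done in one async method, so only one
    // state machine is involved regardless of depth.
    public static async ValueTask<int> InlineAsync(int depth)
    {
        await Task.Yield();
        int result = 0;
        for (int i = 0; i < depth; i++)
        {
            result += 1;
        }
        return result;
    }
}
```

Calling `YieldOnceSketch.NestedAsync(20)` goes through 21 async method invocations, which lines up with the 21 state machines mentioned above, while `YieldOnceSketch.InlineAsync(20)` computes the same value in a single async method.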

RamType0 on 2 May 2020

It's unclear what is being asked for here. Can you explain what codegen you would want given an existing code pattern that is written?

CyrusNajmabadi on 2 May 2020

Currently, if we write this kind of code, 5 state machines and 5 tasks are created every time we call ShowImageAsync.

```C#
async ValueTask ShowImageAsync(string url)
{
    var blob = await DownloadBlobAsync(url);
    await RenderImageAsync(blob);
}

async ValueTask<byte[]> DownloadBlobAsync(string url)
{
    var connection = await ConnectAsync(url);
    return await RequestBlobAsync(connection, url);
}

async ValueTask<Connection> ConnectAsync(string url)
{
    // ConnectAsync code
}

async ValueTask<byte[]> RequestBlobAsync(Connection connection, string url)
{
    // RequestBlobAsync code
}

async ValueTask RenderImageAsync(ReadOnlyMemory<byte> blob)
{
    // RenderImageAsync code
}
```

We could reduce the number of tasks and async state machines with this kind of code generation.

```C#
async ValueTask ShowImageAsync(string url)
{
    var connection = /* ConnectAsync code, inlined (awaits included) */;
    var blob = /* RequestBlobAsync code, inlined (awaits included) */;
    /* RenderImageAsync code, inlined (awaits included) */
}

async ValueTask<byte[]> DownloadBlobAsync(string url)
{
    var connection = /* ConnectAsync code, inlined (awaits included) */;
    return /* RequestBlobAsync code, inlined (awaits included) */;
}

async ValueTask<Connection> ConnectAsync(string url)
{
    // ConnectAsync code
}

async ValueTask<byte[]> RequestBlobAsync(Connection connection, string url)
{
    // RequestBlobAsync code
}

async ValueTask RenderImageAsync(ReadOnlyMemory<byte> blob)
{
    // RenderImageAsync code
}
```

With this code generation, only one state machine and task are created every time we call ShowImageAsync.

RamType0 on 2 May 2020

> We could reduce the number of tasks and async state machines with this kind of code generation.

I don't see how that would be ok. It would be inlining all the code (transitively) of all async methods called. That would greatly increase the side of these methods.

CyrusNajmabadi on 3 May 2020

alrz on 3 May 2020

> That would greatly increase the side of these methods.

You mean the size of these methods?
As I said at the beginning, it would work like F# inline methods.
F# already does this. Is the assembly size increase caused by F# inline methods considered harmful?

RamType0 on 3 May 2020

> It would be inlining all the code (transitively) of all async methods called. That would greatly increase the side of these methods.

Instead of inlining the whole code, can't we just pass the state machine instance to the nested method, so that it modifies the state of the existing state machine?

Meaning: we still have method boundaries, but all of them operate on a single state machine.
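
A toy illustration of that idea, written by hand and deliberately without real awaiters, might look like the following. It is not compiler output and not an existing Roslyn feature. The `SharedStateMachine` class, the step methods, and the string standing in for a connection are all hypothetical. The only point is that each step keeps its own method boundary while reading and writing one shared state object.

```C#
using System;

// Hypothetical: one state object shared by every step.
public sealed class SharedStateMachine
{
    public int State;                          // which step runs next
    public string Url = "";
    public string Connection = "";             // stand-in for a real connection type
    public byte[] Blob = Array.Empty<byte>();
    public bool Done;
}

public static class SharedStateSketch
{
    // Each step is still a separate method, but it only touches the shared state.
    static void ConnectStep(SharedStateMachine sm)
    {
        sm.Connection = "connection:" + sm.Url;
        sm.State = 1;
    }

    static void RequestBlobStep(SharedStateMachine sm)
    {
        sm.Blob = new byte[sm.Connection.Length];
        sm.State = 2;
    }

    static void RenderStep(SharedStateMachine sm)
    {
        Console.WriteLine($"rendered {sm.Blob.Length} bytes");
        sm.Done = true;
    }

    // Plays the role of MoveNext: one dispatcher over one state object.
    public static void MoveNext(SharedStateMachine sm)
    {
        switch (sm.State)
        {
            case 0: ConnectStep(sm); break;
            case 1: RequestBlobStep(sm); break;
            case 2: RenderStep(sm); break;
        }
    }

    // Drives the machine to completion. In real async code each MoveNext call
    // would be triggered by a continuation rather than by a loop.
    public static SharedStateMachine Run(string url)
    {
        var sm = new SharedStateMachine { Url = url };
        while (!sm.Done)
        {
            MoveNext(sm);
        }
        return sm;
    }
}
```

`SharedStateSketch.Run("x")` allocates a single `SharedStateMachine` and passes it through all three steps, which is the allocation profile the comment above is asking for. How awaiters, exceptions, and results would be threaded through such shared state is left open here.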

alrz on 3 May 2020

Duplicate of #22052

jinujoseph on 7 May 2020

cc @genlu

sharwell on 7 May 2020

@RamType-0 we are interested in providing an "inline method" feature more broadly. This issue captures one possible use case, but there are many different teams that could benefit from having this available in different ways. I believe we are starting fairly conservatively with the implementation approach, but would be interested in expanding coverage to new scenarios as they are identified with use of the feature.

sharwell on 7 May 2020

@jinujoseph @sharwell How is this a duplicate? https://github.com/dotnet/roslyn/issues/22052 is about a refactoring (i.e. something that changes the C# code in the editor), while, as I understand it, this issue is about a compiler feature (i.e. something that keeps the C# code intact, but changes the generated IL).

Since the two have completely different goals (one is all about removing abstraction, the other is about maintaining it), I don't see how one could be a duplicate of the other.

svick on 8 May 2020

@svick I was going based on the information in the first post, which I see is ambiguous. If this is a request for the compiler to automate this inlining (as opposed to a refactoring of the code itself), then this would be a duplicate of #15491.

sharwell on 8 May 2020

👍1

Yeah, this is asking for a compiler optimization. I do not think this should be tracked as part of #22052.