Aspnetcore: Smarter output buffering

Created on 9 Apr 2017 · 20 comments · Source: dotnet/aspnetcore

In order to improve application performance, we might be able to do some clever buffering that won't affect most applications. The theory is that we can buffer all output until there's no more input in the pipe. Even though there's no correlation between input and output, this shouldn't introduce bad buffering behavior in most cases and will improve pipelined requests dramatically. We can still respect the IHttpBufferingFeature feature interface if there are cases where the application isn't reading the body and is writing to the output (this is bound to break somebody).

We can bury this behind a flag and enable it only in the benchmarks, in case we see issues with it. Also, respecting IHttpBufferingFeature gives applications a way to get the immediate-write behavior back.
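For reference, the escape hatch mentioned above would look something like this from middleware (a sketch only, assuming the `IHttpBufferingFeature` shape in `Microsoft.AspNetCore.Http.Features` at the time; whether the new buffering honors it is exactly the open question here):

```csharp
// Sketch: an app that writes to the output without reading the body
// could opt out of output buffering via the feature interface.
public async Task InvokeAsync(HttpContext context)
{
    var buffering = context.Features.Get<IHttpBufferingFeature>();

    // The feature may be null if the server doesn't expose it.
    buffering?.DisableResponseBuffering();

    await context.Response.WriteAsync("event: ping\n\n");
    // Without the call above, this write could sit in the output
    // buffer until the input pipe drains.
}
```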

affected-very-few area-servers enhancement servers-kestrel severity-nice-to-have

Most helpful comment

Would want to switch it off for upgraded streams (websockets)

All 20 comments

Or buffer until (packet full (~1400 bytes) || no more input). Layer 7 (application) Nagle rather than TCP's Nagle; TCP doesn't know whether more data is coming (bad), whereas the application does (good).

Would want to switch it off for upgraded streams (websockets)

And make the size bigger for tls :)

Saw 50% gain when experimenting with this before https://github.com/aspnet/KestrelHttpServer/pull/1236

The output buffering helps a lot for the plaintext benchmark. The test itself seems very artificial (100% load + all synchronous responses).
Does it happen often that output buffers can be stitched together on their way to the kernel?

@tmds TCP has Nagle, which is on by default for sockets, to do this; it's switched off by default in Kestrel because it interacts badly with delayed acks, which are a harder thing to control, and it also introduces transmission delays.

At the TCP layer the kernel isn't aware of what the application code will do next, whereas at the application layer there is more knowledge; so this mostly reintroduces the benefits of Nagle, but with added context to remove most of the disadvantages.

Here the application layer has control by flushing when a meaningful amount of data is available (e.g. http response). In the Nagle case, most of the small (1 byte) packets were not meaningful on their own.
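For context, the TCP-level knob being discussed maps to `TCP_NODELAY`, which .NET exposes on the socket (a sketch using the standard `System.Net.Sockets` API):

```csharp
using System.Net.Sockets;

var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

// NoDelay = true disables Nagle's algorithm: small writes go out
// immediately instead of waiting to coalesce (Kestrel's default).
socket.NoDelay = true;

// NoDelay = false re-enables Nagle: the kernel batches small segments,
// at the cost of interacting badly with delayed ACKs.
```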

An interesting observation is the libuv thread already delays writes. For the plaintext benchmark this coalesces the outputs of the pipelined requests. The delay on the libuv thread depends on the load of the thread which makes it somewhat self regulating. When there is more load, the delay is higher and the chances of aggregating become higher too. When the load is low, there is less aggregating (but there is also more capacity to do separate sends).

Aggregating will increase latency.

@pakrym This is lower priority than everything else, and if we can't get it done it's fine.

Add another method to pipe? SignalAsync where flushing optional for the receiver vs FlushAsync where it is compulsory?

So Commit => Signal(+Commit) => Flush(+Commit+Signal)

or add a param to FlushAsync

FlushAsync(bool allowBuffering = false)

TryFlush

I think TryFlush is too ambiguous in meaning? It would need to be TryFlushAsync, as it may block.

That's contrary to TryRead, which means don't block.

Actually don't you have commit vs flush? At the moment I commit the multiple messages in a "flight" and flush on the last

3 ways to "end" writing could become a mess

3 levels of granularity; each one would also do the one above

 |  Commit -> make data available
 |  Signal -> tell data is available              (backpressure + schedule)
 v  Flush  -> tell to do something with the data  (backpressure + schedule)

So what happens if you call Signal and you are full (need to apply backpressure)? Does it basically become a Flush?

Yeah; that's kind of the point. Optional flush vs compulsory flush.
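Putting the three levels together, the proposed surface might look something like this (purely hypothetical; SignalAsync is not an existing pipe member, and the names are taken from the thread):

```csharp
public interface IBufferedPipeWriter
{
    // Commit: make written data available to the reader; no wake-up.
    void Commit();

    // Signal: Commit + tell the reader data is available; the receiver
    // may defer the actual flush. If backpressure applies, it
    // effectively degrades to a full flush.
    ValueTask SignalAsync();

    // Flush: Commit + Signal + force the data onward (e.g. to the socket).
    ValueTask FlushAsync();
}
```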

So loop could be something like

while (true)
{
    var result = await Input.ReadAsync();
    var buffer = result.Buffer;

    while (!buffer.IsEmpty)
    {
        while (hasWork) // pseudo-condition: more work for this input
        {
            // Do work
            while (hasSubWork) // pseudo-condition: more sub-work
            {
                // Do sub work
                Output.Advance(bytes);
            }
            Output.Commit();
        }

        Input.Advance(buffer.End);

        // Check for more data
        if (Input.TryRead(out result))
        {
            buffer = result.Buffer;
            // Signal data is ready before processing more
            await Output.SignalAsync();
        }
    }

    // Flush before blocking for more input
    await Output.FlushAsync();

    if (result.IsCompleted)
    {
        break;
    }
}

Output.Complete();

And for a simple user scenario where you aren't too concerned about unblocking all the flows, you can just drop Commit, SignalAsync and TryRead, so something like

while (true)
{
    var result = await Input.ReadAsync();
    var buffer = result.Buffer;

    if (!buffer.IsEmpty)
    {
        // Do work
        Output.Advance(bytes);

        Input.Advance(buffer.End);

        // Flush before blocking for more input
        await Output.FlushAsync();
    }

    if (result.IsCompleted)
    {
        break;
    }
}

Output.Complete();

I like it, but... you need Signal and Flush on the PipeWriter... kinda what I am doing, without the Signal.

It's pseudo-code, in the non-compiling dialect they use in academic papers 😉

Talked to @davidfowl; moving to 2.1.

/cc @muratg
