Runtime: WebSocketProtocol.CreateFromStream should allow specifying a receive buffer and a send buffer

Created on 16 Mar 2018 · 28 comments · Source: dotnet/runtime

Currently you can specify a receive buffer via the WebSocketProtocol.CreateFromStream API, but you cannot control the send buffer. The managed websocket implementation always allocates a pooled array for sends, which may not be optimal if the underlying stream is already buffered (it introduces an extra layer of unnecessary buffering).

I'd like to propose we change this method to the following:

```C#
using System.IO;

namespace System.Net.WebSockets
{
    public static class WebSocketProtocol
    {
        // Proposed API:
        public static WebSocket CreateFromStream(Stream stream, bool isServer, string subProtocol, TimeSpan keepAliveInterval, int? receiveBufferSize = null, int? sendBufferSize = null) { throw null; }
        // Existing API:
        public static WebSocket CreateFromStream(Stream stream, bool isServer, string subProtocol, TimeSpan keepAliveInterval, Memory<byte> buffer = default) { throw null; }
    }
}
```

I'd also like to get this in for 2.1 if possible 😄

A few questions:

  • Today we allow passing a single buffer to the websocket API for receives but if you use a pooled buffer, you can only return it when the websocket is disposed. That would be the same deal for the sendBuffer.
  • The existing WebSocket.CreateClient API just lets you specify the size of the internal buffer instead of taking a buffer directly. While this is less flexible, I'm not sure how useful it is to allow passing in the buffer itself to the WebSocket constructor. I might be missing something here.

Note: Existing API:
https://github.com/dotnet/corefx/blob/cc590fd30b7000458d055164c641461d616ed33c/src/System.Net.WebSockets.WebSocketProtocol/ref/System.Net.WebSockets.WebSocketProtocol.cs#L14

/cc @stephentoub @karelz

api-approved area-System.Net


All 28 comments

I'd like to propose we change this method to the following

In general I'm fine with that. We made it just a single scratch buffer with the idea that the implementation could use it however it wanted, for receive, for send, or partitioned for both, even though currently the implementation only uses it for receives.

That said, how do you propose the implementation use the provided send buffer? From our conversation yesterday, I thought you didn't want the WebSocket buffering sends at all, and this seems to be going in the opposite direction. Today the implementation rents a buffer for the whole message to be sent; what happens if the provided user buffer isn't sufficient?

And providing a buffer here means it'll be held onto/owned for the lifetime of the WebSocket. Doesn't that go against your desire to minimize held state while the WebSocket is idle?

Today we allow passing a single buffer to the websocket API for receives but if you use a pooled buffer, you can only return it when the websocket is disposed. That would be the same deal for the sendBuffer.

What's the question?

I'm not sure how useful it is to allow passing in the buffer itself to the WebSocket constructor. I might be missing something here.

I'd have thought this would be something you'd really want, so that you can control from what pool the memory comes from, whether it's prepinned, etc.

The existing WebSocket.CreateClient API just lets you specify the size of the internal buffer instead of taking a buffer directly.

It actually does both.

That said, how do you propose the implementation use the provided send buffer? From our conversation yesterday, I thought you didn't want the WebSocket buffering sends at all, and this seems to be going in the opposite direction. Today the implementation rents a buffer for the whole message to be sent; what happens if the provided user buffer isn't sufficient?

I planned to pass Memory<byte>.Empty for sends. For receives we're passing in the minimum header length (14 bytes).

And providing a buffer here means it'll be held onto/owned for the lifetime of the WebSocket. Doesn't that go against your desire to minimize held state while the WebSocket is idle?

Yes, on further thinking maybe the send buffer should be a buffer size instead of a buffer itself. We would still pass 0 to turn buffering off completely but then it would let the internal implementation still rent up to that amount from the array pool if it wanted to. That way you don't have a long lived buffer hanging around for the lifetime of the websocket.

What's the question?

See above, I think it makes less sense for the sendBuffer than the receive buffer (though I guess the same argument can be made).

I'd have thought this would be something you'd really want, so that you can control from what pool the memory comes from, whether it's prepinned, etc.

I think this makes more sense if I didn't have to pass in a receive buffer for the payload as well. Right now to get optimal behavior we're trying to do a couple of things:

  • Use a 14 byte receive buffer to avoid allocating the default 4K buffer per idle websocket
  • We're doing 0 byte receives so we don't allocate anything until there is data, then a follow up receive with pooled memory.
  • When running on Kestrel there's no need to use pre-pinned buffers since the underlying Stream isn't pinning user buffers (we're incurring a copy here because of the "pass in a buffer" paradigm).

Now if you were using this against a NetworkStream directly which does want to pin the handed in memory, you can do it today for the payload but not for the header. Same goes for sends (payload only not the outer framing).

It actually does both.

Ah, good to know. I still think it makes less sense when using a pool 😄. You end up taking memory out of the pool for a long time (lifetime of the socket vs lifetime of any particular operation).

Now I think it could make sense to pass both a MemoryPool and a size, but since each operation requires passing in a buffer anyway, IMO that would only make sense if we ended up doing the CopyToAsync overload or added new Read/WriteAsync overloads that used ValueTask and didn't require the user to pass any buffer at all.

Yes, on further thinking maybe the send buffer should be a buffer size instead of a buffer itself. We would still pass 0 to turn buffering off completely but then it would let the internal implementation still rent up to that amount from the array pool if it wanted to.

I'm ok with that.

We're doing 0 byte receives

How are you doing the 0-byte receive? You're doing it on the underlying connection for the stream you passed to WebSocket, and then only calling WebSocket.ReceiveAsync when you know there's something there to read?

Use a 14 byte receive buffer to avoid allocating the default 4K buffer per idle websocket

You mean that's what you want to do, rather than what you're currently doing, as that will require a change in the implementation to do well, no?

If we need a change to API, we should finalize it ASAP. We will also need shiproom approval (we are past API freeze).
How big a change to the implementation do we expect? A couple of hours of work, or more?

How are you doing the 0-byte receive? You're doing it on the underlying connection for the stream you passed to WebSocket, and then only calling WebSocket.ReceiveAsync when you know there's something there to read?

The plan is to do it at both layers. Using libuv this is already the case, and we're now trying to add it to the socket transport https://github.com/aspnet/KestrelHttpServer/pull/2393. SignalR also has to do it at the websocket layer https://github.com/aspnet/SignalR/pull/1603. The chain looks like this:

```
WebSocket.ReceiveAsync(0)
   HttpUpgradeStream.ReceiveAsync(14)     // This is the internal buffer for the websocket header
       RequestBodyPipeReader.ReadAsync()  // This is waiting on data to be written to the transport input
          Transport.Input.ReadAsync()     // This is waiting on data to be read out of the socket.
              socket.ReceiveAsync(0)
```
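To make the bottom of that chain concrete, here is a minimal sketch of the zero-byte receive pattern at the socket layer. This is illustrative only, not Kestrel's actual code; the method name and `maxSize` parameter are assumptions, and it relies on the `Memory<byte>`-based `ReceiveAsync` overload completing only once data (or EOF) is available.

```C#
using System;
using System.Buffers;
using System.Net.Sockets;
using System.Threading.Tasks;

static class ZeroByteReceiveSketch
{
    static async Task ReadWhenReadyAsync(Socket socket, int maxSize)
    {
        // Zero-byte receive: waits for data without holding any buffer,
        // so nothing is rented while the connection is idle.
        await socket.ReceiveAsync(Memory<byte>.Empty, SocketFlags.None);

        // Data is available now; rent a buffer only for the real read.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(maxSize);
        try
        {
            int read = await socket.ReceiveAsync(buffer, SocketFlags.None);
            // ... process buffer[0..read] here, before returning the array ...
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```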

You mean that's what you want to do, rather than what you're currently doing, as that will require a change in the implementation, no?

That's what the implementation lets us do right now.

@karelz If we need a change to API, we should finalize it ASAP. We will also need shiproom approval (we are past API freeze).
How big a change to the implementation do we expect? A couple of hours of work, or more?

Agreed, it's not that much work. Probably a day with tests but @stephentoub can better estimate.

Probably a day with tests but @stephentoub can better estimate.

Yeah, that's about right, maybe two days, as whatever's done here needs to work well for both ASP.NET and ClientWebSocket, which might mean tweaking SocketsHttpHandler as well, depending on the design that's landed on here.

It does? Not efficiently:

Not as efficiently as it could, but we're working with the API patterns we have. The websocket doesn't yield until the header is read, and when we pass in a 0-byte payload buffer, it returns telling us there is data, and then we read again. Unfortunately it doesn't communicate the payload length to the caller; that would let us drain the receive without doing 0-byte reads in between, but it's something we can live with for now.
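The same pattern at the websocket layer might look like the following sketch. It assumes the `Memory<byte>`-based `WebSocket.ReceiveAsync` overload; the buffer size and names are illustrative, not from the actual SignalR change.

```C#
using System;
using System.Buffers;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

static class WebSocketZeroByteSketch
{
    static async Task ReceivePayloadAsync(WebSocket webSocket, CancellationToken cancellationToken)
    {
        // 0-byte payload buffer: yields once a frame header has been read,
        // without copying any payload.
        await webSocket.ReceiveAsync(Memory<byte>.Empty, cancellationToken);

        // Follow-up receive with pooled memory for the actual payload.
        byte[] payload = ArrayPool<byte>.Shared.Rent(4096);
        try
        {
            ValueWebSocketReceiveResult result =
                await webSocket.ReceiveAsync(payload.AsMemory(), cancellationToken);
            // ... consume payload[0..result.Count] ...
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(payload);
        }
    }
}
```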

Not as efficiently as it could but we're working with the API patterns we have.

I mean not efficiently at all. If you give the WebSocket a 14-byte receive buffer when calling CreateFromStream, with the current implementation as far as I see you'll never get more than 14 bytes at a time in any call to ReceiveAsync.

Ugh, I thought I read that incorrectly. Well that's terrible and we need to fix that...

That's what I'm saying. It wasn't written expecting that access pattern; if that's the access pattern that's going to be used with it, the implementation needs to change to accommodate.

Thanks for the heads up, we can't merge that PR until this is fixed. Should I open another issue?

Should I open another issue?

Sure. And then maybe a PR? :smile:

Updated the proposal.

Updated the proposal.

Thanks. Can you also update the proposal with what you expect the behavior to be for null buffer sizes, and for when the buffer sizes are smaller than the implementation wants?

We're generally fine with the API shape, however:

  • At this point in the cycle we need shiproom approval for API breaking changes. @davidfowl should work with @karelz to get it approved.
  • We need to clarify the semantics for the buffer sizes that are supplied (0, too small, etc). @davidfowl follow up with @karelz to get this locked in.

Once both things are settled, @karelz will just mark this issue as approved.

@muratg are you ok with this change so late?

@karelz It should be OK from our side.

OK, let's get clarification on the above first (@davidfowl?).
Then let's get shiproom approval (I can help).
Then who's going to make the changes in CoreFX and ASP.NET? @davidfowl or someone on @muratg's team?

@karelz Thanks! Yeah, we have PRs open in both Kestrel and SignalR where this can be added (or later, based on when this can go in.) @davidfowl or someone in my team can handle ASP.NET changes, but I'm unsure about CoreFX.

I think we can take on both sides. Give me some time and I'll come up with a plan here in a few days (1-2).

I think for 2.1 we'll do this:

It just means we won't be able to avoid the send buffer in the short term

And we'll be shipping a new API we don't believe in. If we don't think this new API added in 2.1 is right, let's try hard to get it fixed for 2.1.

Decision here, at least for now, is to simply remove the buffer argument. The ManagedWebSocket implementation now just creates a small buffer for use with headers and control payloads, but actual message payloads are written directly to the caller's buffer passed to ReceiveAsync. If subsequently we determine we need to add overloads that provide more fine-grained control here and for how sends are handled, we can open a new issue (sends are also complicated by the need, at least on the client, to apply a mask to the data, and the supplied buffer sent into SendAsync is read-only, so the masking needs to happen in a temporary anyway).

Affected APIs - stripping the optional buffer argument:

```c#
namespace System.Net.WebSockets
{
    public class WebSocket
    {
        // Current API shape:
        public static WebSocket CreateFromStream(Stream stream, bool isServer, string subProtocol, TimeSpan keepAliveInterval, Memory<byte> buffer = default) {}
        // New proposed API shape:
        public static WebSocket CreateFromStream(Stream stream, bool isServer, string subProtocol, TimeSpan keepAliveInterval) {}
    }
}

// Note: OOB assembly System.Net.WebSockets.Protocol shipping only for ASP.NET to enable ASP.NET targeting netstandard - new in 2.1
namespace System.Net.WebSockets
{
    public static class WebSocketProtocol
    {
        // Current API shape:
        public static WebSocket CreateFromStream(Stream stream, bool isServer, string subProtocol, TimeSpan keepAliveInterval, Memory<byte> buffer = default) {}
        // New proposed API shape:
        public static WebSocket CreateFromStream(Stream stream, bool isServer, string subProtocol, TimeSpan keepAliveInterval) {}
    }
}
```
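For illustration, a hypothetical call site for the trimmed API might look like the sketch below. The `stream` parameter is assumed to be the `Stream` obtained from the HTTP upgrade, and the keep-alive interval is an arbitrary example value.

```C#
using System;
using System.IO;
using System.Net.WebSockets;

static class CreateFromStreamSketch
{
    static WebSocket CreateServerWebSocket(Stream stream)
    {
        // No buffer argument: the implementation manages its own small
        // header/control buffer and writes payloads straight into the
        // caller's buffer passed to ReceiveAsync.
        return WebSocketProtocol.CreateFromStream(
            stream,
            isServer: true,
            subProtocol: null,
            keepAliveInterval: TimeSpan.FromSeconds(30));
    }
}
```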

Assigned to you, @stephentoub. I've already reacted to the ASP.NET change.

@davidfowl do you mean that ASP.NET does not use the buffer arg anymore? Can you please link the PR(s) for tracking purposes? Thanks!
