Aspnetcore: Support System.Net.Connections.ConnectionListenerFactory-based transports and middleware (mechanism to apply back pressure to accepting connections)

Created on 20 Aug 2019 · 19 comments · Source: dotnet/aspnetcore

Is your feature request related to a problem? Please describe.

To match functionality with WCF on .NET Framework, CoreWCF needs to be able to control the number of pending accepts. When a client connects to WCF using the NetTcp transport, there is a handshake that needs to be completed which includes authentication and security upgrade negotiation. After a connection has been established and while the handshake is ongoing, the connection is considered a pending connection. WCF caps the number of connections in the pending connection state by ensuring the number of pending accepts is never more than the number of additional pending connections that are still allowed. For example, if the maximum number of pending connections is configured to be 100 and there are currently 95 pending connections, we limit the number of pending accepts to be no more than 5.

Describe the solution you'd like

I believe there should be a generic throttling interface that can be added by DI for use at any layer within an app, and a derived interface (which adds no extra methods) which is used at the transport layer. The two interfaces would be defined as:
```c#
public interface IThrottle
{
    // Completes with true when the throttle is acquired, and with false when it
    // cannot be (fail-fast implementations, or cancellation while waiting).
    ValueTask<bool> AcquireAsync(CancellationToken cancellationToken = default(CancellationToken));
    void Release();
}

public interface ITransportThrottle : IThrottle { }
```
When calling AcquireAsync, one of four things can happen.

  1. The returned ValueTask is completed with a value of true meaning the throttle was acquired.
  2. The returned ValueTask is not completed and the call goes async. When a throttle can be acquired, the ValueTask is completed with a value of true meaning the throttle was acquired.
  3. The returned ValueTask is completed with a value of false, meaning the throttle cannot be acquired immediately. This would be used for a throttle implementation where you want to fail immediately, such as returning a 503 throttled HTTP response.
  4. The returned ValueTask is not completed and the call goes async. If the cancellation token is cancelled, the ValueTask is completed with a value of false, meaning the throttle was not acquired. This would be used for scenarios where waiting for a throttle can time out, or when shutting down the application and you need to complete the pending AcquireAsync call for cleanup.
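
For illustration, a minimal sketch of a throttle with these semantics, assuming the IThrottle/ITransportThrottle interfaces above and a SemaphoreSlim-backed implementation (the type name is hypothetical, not part of the proposal):

```c#
using System;
using System.Threading;
using System.Threading.Tasks;

// Illustrative semaphore-backed throttle covering outcomes 1, 2 and 4 above.
public sealed class SemaphoreTransportThrottle : ITransportThrottle
{
    private readonly SemaphoreSlim _semaphore;

    public SemaphoreTransportThrottle(int maxConcurrency)
        => _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    public async ValueTask<bool> AcquireAsync(CancellationToken cancellationToken = default(CancellationToken))
    {
        try
        {
            // Completes synchronously when a slot is free (outcome 1),
            // otherwise goes async until a slot is released (outcome 2).
            await _semaphore.WaitAsync(cancellationToken);
            return true;
        }
        catch (OperationCanceledException)
        {
            // Outcome 4: cancelled while waiting, so the throttle was not acquired.
            return false;
        }
    }

    public void Release() => _semaphore.Release();
}
```

A fail-fast variant (outcome 3) could instead call `_semaphore.Wait(0)` and complete with false when no slot is immediately available.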

In the code which calls Accept on the listening socket, if an ITransportThrottle exists in DI, the accept loop acquires the throttle and adds another pending accept, repeating until ITransportThrottle.AcquireAsync returns a non-completed ValueTask. It would be up to the implementation of the ITransportThrottle to ensure there's a sensible number of pending accepts. Basically, the number of pending accepts is completely controlled by the ITransportThrottle if it exists. A rough sketch of such a loop follows.
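
Here is that sketch of the accept loop (assuming the interfaces above; the class, method, and parameter names are made up for illustration, and error handling is omitted):

```c#
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Connections;

public static class ThrottledAcceptLoop
{
    // Posts a new pending accept each time the throttle grants a slot, and pauses
    // whenever AcquireAsync goes async, so the throttle alone controls the count.
    public static async Task RunAsync(
        IConnectionListener listener,
        IThrottle throttle,
        ConnectionDelegate connectionDelegate,
        CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            if (!await throttle.AcquireAsync(stoppingToken))
            {
                break; // cancelled (e.g. shutdown); stop posting accepts
            }

            // Don't await the accept itself, so multiple accepts can be pending
            // while the throttle keeps completing synchronously.
            _ = AcceptOneAsync(listener, throttle, connectionDelegate, stoppingToken);
        }
    }

    private static async Task AcceptOneAsync(
        IConnectionListener listener,
        IThrottle throttle,
        ConnectionDelegate connectionDelegate,
        CancellationToken stoppingToken)
    {
        ConnectionContext connection = null;
        try
        {
            connection = await listener.AcceptAsync(stoppingToken);
            if (connection == null)
            {
                return; // the listener was unbound
            }
            await connectionDelegate(connection);
        }
        finally
        {
            throttle.Release();
            if (connection != null)
            {
                await connection.DisposeAsync();
            }
        }
    }
}
```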

Edit: Changed IServiceThrottle to ITransportThrottle

Design affected-very-few area-servers enhancement severity-nice-to-have yarp Bedrock


All 19 comments

:+1:

Note that backpressure exists at the OS level. This is the parameter that is passed to the Socket.Listen call.

I wonder if it makes sense to make IThrottle an explicit part of the bedrock abstractions. For example, a parameter on IConnectionListener.Bind.
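
For reference, that OS-level parameter is the backlog argument to Socket.Listen (the port and backlog values below are arbitrary):

```c#
using System.Net;
using System.Net.Sockets;

var listenSocket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
listenSocket.Bind(new IPEndPoint(IPAddress.Any, 5000));

// Connections the OS has completed wait in this queue until the app calls Accept.
listenSocket.Listen(backlog: 512);
```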

CC @davidfowl @ReubenBond @halter73

I see the value in this existing but I don't think it belongs on any of the existing APIs.

Is the IConnectionListener meant to call AcquireAsync or is it some other component?

I can't see how that can be enforced at this point. I would assume it's the caller.

@tmds, the backlog argument to Socket.Listen isn't back pressure. That's a buffer to accommodate bursts of incoming connections that the app isn't calling Accept fast enough to handle. If you have a listen backlog of 0 configured, the OS will deny any incoming connection if there isn't an available pending Accept. That isn't what I need solved.

When you are CPU contended, the OS gives each thread a quantum of time to execute before preemptively switching to another thread. During the time that a thread has the CPU, it can issue multiple asynchronous Accept calls on a listening socket before it yields. Those accepts complete on IO threads, which have a different scheduling priority, and the preemptive behavior is very different in the OS. So basically you can have a CPU pegged at 100% while "worker" threads are preempted by IO threads which handle the completed Accept and (hopefully) dispatch the handler onto a worker thread. You should only be calling Accept on an IO thread (Windows will cancel the pending IO if the issuing thread exits, so calling Accept on a worker thread causes the CLR to keep more threads alive in the thread pool so as not to cancel pending IO).

So unless you have a mechanism to prevent new pending Accept calls from being made, the OS scheduling of IO threads keeps adding more work to your worker thread pool. You end up reducing performance because of more context switches, any one request takes longer, which can lead to client timeouts, which leads to resubmitted work, and you can get into a death spiral.

@davidfowl, I'm not suggesting adding this to an existing API. I'm proposing a new interface API for throttling and suggesting how the connection listener (I presume the IConnectionListener that @tmds mentioned) could use that API to allow an application to prevent the death spiral you can get into without it. This is a very real scenario and not just academic, as I have worked with customers who have had this happen when they set their WCF transport limits too liberally.

> @davidfowl, I'm not suggesting adding this to an existing API. I'm proposing a new interface API for throttling and suggesting how the connection listener (I presume the IConnectionListener that @tmds mentioned) could use that API to allow an application to prevent the death spiral you can get into without it.

Sounds like an implementation detail.

> @tmds, the backlog argument to Socket.Listen isn't back pressure. That's a buffer to accommodate bursts of incoming connections that the app isn't calling Accept fast enough to handle.

In both cases it's implementing a limited queue. I'm not sure when it fits the name backpressure and when not.

> For example, if the maximum number of pending connections is configured to be 100 and there are currently 95 pending connections, we limit the number of pending accepts to be no more than 5.

How do you prevent this from being used for an attack of the form: make a number of connections that perform the handshake as slowly as possible? This stops real clients from making connections.

We have a configurable timeout which limits how long a connection is allowed to take to perform the handshake. net.tcp isn't an internet protocol, it's a LAN or WAN protocol so you are running inside a trusted network. If you have a malicious host in your LAN, it's already game over as it can do things like MAC address spoofing, mess with BGP routing, mess with DHCP to do MITM attacks, perform DNS poisoning, passively snoop on the network to pick up any unencrypted data and 1001 other ways to screw with your network.

If you must expose a web service to a hostile environment, you generally run it behind a LB or firewall which does things like limit the number of concurrent requests from a single IP and apply other DoS protections. At the end of the day, if a web service is exposed to malicious clients, the only way to prevent DoS attacks is to run at scale and/or have a smart front end like a LB which is specialized in detecting attack patterns and preventing malicious clients from getting through to the server. But these generally aren't the problem domain of WCF's net.tcp protocol.

The WCF server in .NET Framework has customers with significantly more than 150,000 clients connected to a single server. When that WCF service needs to be restarted, e.g. for a software update, if you accept connections as fast as you can, you will have so many connection handshakes in flight at once that each handshake exceeds its timeout. When the service sends some bytes as part of the handshake and then calls ReceiveAsync, the amount of time it takes for the await continuation to execute, due to the amount of contention, exceeds your timeout.

Even if CPU/thread contention wasn't a limitation (you are running on a 128 core system), the security process in Windows (lsass) is single-threaded and limited by the speed of a single core. So the security handshake bottlenecks on the single core that works through the queued authorization requests, and when that core is running at 100% the handshakes will exceed the connection timeout. So the connection is killed, most likely by the client first, as the timeout firing also requires a thread to run on, and the client attempts to reconnect. Now you have the original connection attempt still in the lsass queue, abandoned by the client but impossible to cancel because there's no way to cancel a pending authorization, and you throw another one into the queue. Rinse and repeat.

If you are CPU/thread contended due to too many concurrent connection handshakes, then for the clients which have managed to connect, their actual service requests are competing for CPU time with the connecting clients, so their requests time out. That causes the channel to be faulted, which means the connection needs to be re-established, and that client joins the party.

If you have control of the number of pending accepts, you can limit the number of concurrent handshakes, which means you can guarantee (by tuning your service) that you are not overly contended on completing those handshakes. Sure, you will have clients being denied connecting to the server socket, but nobody in the listen backlog will time out, because you configure things high enough that you can accept new requests at a fast, sustainable rate and you size your listen backlog so that no connection stays in it for more than a few seconds. But if you don't have a way to control how many outstanding accepts you have, then with enough clients connecting at once you can get into a death spiral that makes itself worse.

I know I could asynchronously wait after a socket has been accepted and not start the handshake, but then you are allocating resources for each incoming connection just to have it wait there. With the rate of allocation from a large number of incoming connections, you would be running a Gen0 GC quite regularly, so all those waiting connections are likely to have their allocations reach at least Gen1 before the connection times out. So now you are churning the GC heavily by having them wait there. You also have a lot of timers registered for expiring the wait before starting the handshake. And you run the risk of a connection reaching the handshake with very little time left on its timer and being aborted partway through, because the timer starts when you accept the connection. It's better to distribute the timers needed on the client with a retry backoff algorithm than to have timer contention on the server.

Overall, there are some existing large-scale WCF workloads that an AspNetCore-based WCF solution just wouldn't be able to handle without Accept throttling. WCF is used a lot in enterprises, so we get the start-of-workday connection spike where everyone sits down at their desk at about the same time and logs into an app all at once.

@mconnew thanks for the interesting and detailed response. So this is to maintain performance with a high number of concurrent accepts, and not for security.

@tmds, correct, this is about scalability. I can't think of a scenario where this would affect regular HTTP but it could affect WebSockets or gRPC. Especially if using something like token based authentication.

I'm going to look into a design for this. It comes up because the ConnectionDelegate is a push model and the server is accepting connections and dispatching.

@davidfowl at what point do you consider a connection _handled enough_ to allow a new accept?

That's policy that can be implemented

From discussing with @mconnew I think policies will be app specific, or would there be some that come with AspNetCore?

We might have a clean design for this and it looks like an IConnectionListener implementation that wraps the transport is the way to go here. It would look like any other connection middleware:

```C#
options.ListenAnyIP(5000, listenOptions =>
{
    listenOptions.UseConnectionThrottling(); // This is the throttling middleware
    listenOptions.UseHttps(); // This is the existing HTTPS middleware
});
```

This works by exposing a new method on `ListenOptions`:

```C#
public class ListenOptions
{
    public ListenOptions UseListenerFilter(Func<IConnectionListener, IConnectionListener> middleware);
}
```

This would allow you to wrap the previous IConnectionListener in order to handle the calls to accept. That's where this logic can be plugged in.
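
As a rough illustration of that shape, a throttling listener wrapper might look something like the following SemaphoreSlim-based sketch (the class name and the choice to release the slot on connection close rather than after the handshake are assumptions, not part of the design above):

```C#
using System.Net;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Connections;

public sealed class ThrottlingConnectionListener : IConnectionListener
{
    private readonly IConnectionListener _inner;
    private readonly SemaphoreSlim _slots;

    public ThrottlingConnectionListener(IConnectionListener inner, int maxPendingConnections)
    {
        _inner = inner;
        _slots = new SemaphoreSlim(maxPendingConnections, maxPendingConnections);
    }

    public EndPoint EndPoint => _inner.EndPoint;

    public async ValueTask<ConnectionContext> AcceptAsync(CancellationToken cancellationToken = default)
    {
        // No accept is issued against the wrapped transport until a slot frees up.
        await _slots.WaitAsync(cancellationToken);

        ConnectionContext connection = await _inner.AcceptAsync(cancellationToken);
        if (connection == null)
        {
            _slots.Release(); // the listener was unbound; give the slot back
            return null;
        }

        // Simplistic: release the slot when the connection closes. A real
        // implementation would likely release once the handshake completes.
        connection.ConnectionClosed.Register(() => _slots.Release());
        return connection;
    }

    public ValueTask UnbindAsync(CancellationToken cancellationToken = default) => _inner.UnbindAsync(cancellationToken);

    public ValueTask DisposeAsync() => _inner.DisposeAsync();
}
```

It could then be wired up through the proposed hook, e.g. `listenOptions.UseListenerFilter(inner => new ThrottlingConnectionListener(inner, maxPendingConnections: 100));`.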

@halter73 spiked something very similar today (for different reasons) but I think this approach has legs and doesn't introduce new concepts.

Thanks for contacting us.
We're moving this issue to the Next sprint planning milestone for future evaluation / consideration. We will evaluate the request when we are planning the work for the next milestone. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

:eyes:

Does this issue account for throttling policies that rely on SNI host name? YARP scenarios would benefit from being able to apply throttling bucketed by host names.

This can already be done using something like TlsFrameHelper.GetServerName() (see https://github.com/microsoft/reverse-proxy/pull/262) in Kestrel connection middleware. Connections need to be accepted before SNI-based throttling can take place in order to determine the requested server name.

@mconnew wants a mechanism to throttle Kestrel's accept loop. Kestrel provides no mechanism to do this yet. Connection middleware cannot do this because connections are accepted before Kestrel invokes the connection middleware pipeline.
