Runtime: Singleton HttpClient doesn't respect DNS changes

Created on 29 Aug 2016  路  77Comments  路  Source: dotnet/runtime

As described in the following post: http://byterot.blogspot.co.uk/2016/07/singleton-httpclient-dns.html, once you start keeping a shared HttpClient instance to improve performance, you'll hit an issue where the client won't respect DNS record updates in case of failover scenarios.

The underlying issue is that the default value of ConnectionLeaseTimeout is set to -1, infinite. It'll only close on dispose of the client, which is very inefficient.

The fix is to update the value on the service point like this:

var sp = ServicePointManager.FindServicePoint(new Uri("http://foo.bar/baz/123?a=ab"));
sp.ConnectionLeaseTimeout = 60*1000; // 1 minute

Unfortunately, there's no way to do this with .NET Core today.

Either ServicePointManager should be brought over to .NET Core or similar equivalent functionality should be enabled some other way.

area-System.Net.Http enhancement

Most helpful comment

This is absolutely an issue. The proposed solution by @darrelmiller seems to demonstrate a lack of understanding of the use case.

We use Azure Traffic Managers to evaluate the health of an instance. They work on top of DNS with short TTLs. A host doesn't magically know it is unhealthy and start issuing Connection:close headers. The traffic manager simply detects the unhealthy state and adjusts DNS resolution accordingly. By design this is COMPLETELY TRANSPARENT to clients and hosts... as it stands, an application with a static HttpClient resolving DNS (unknowingly) by a traffic manager will never receive a new host when the previously resolved host becomes unhealthy.

As of netcore 1.1 I'm currently searching for a resolution to this exact problem. I don't see one, nativly anyway.

All 77 comments

I don't think that's the same property. That's the connection timeout, not the ConnectionLeaseTimeout.

As documented in https://msdn.microsoft.com/en-us/library/system.net.servicepoint.connectionleasetimeout.aspx, it should be dropped periodically in some scenarios. But there's no apparent way to change the behavior on .NET Core.

@onovotny Yeah that's a weird thing that seems to do an explicit connection:close after a certain time frame. There is no reason that couldn't be implemented as a piece of middleware if you really wanted to. It's just a HTTP header.

@darrelmiller so that's getting out of my area, not sure... but if there's a documented workaround that addresses the original issue, that would be a huge benefit.

@onovotny, there is no ServicePointManager class in .NET Core. This sounds like a .NET Framework (Desktop) issue. Please open up an issue in UserVoice or Connect.

@onovotny The original issue, as I understand it is that, on the server side, someone wants to update a DNS entry so all future requests to host A go to new IP address. The best solution in my opinion is when that DNS entry is changed, the application on host A should be told to return a Connection:close header to the client on all future responses. That will force all clients to drop the connection and establish a new one. The solution is not to have a timeout value on the client so the client sends a Connection:close when that "lease" timeout expires.

This jives well with the question in Stackoverflow Unit testing DNS failover, using a custom DNS resolver and may give further context around this issue, so I'll cross-link (I'll remove here if inappropriate).

This is absolutely an issue. The proposed solution by @darrelmiller seems to demonstrate a lack of understanding of the use case.

We use Azure Traffic Managers to evaluate the health of an instance. They work on top of DNS with short TTLs. A host doesn't magically know it is unhealthy and start issuing Connection:close headers. The traffic manager simply detects the unhealthy state and adjusts DNS resolution accordingly. By design this is COMPLETELY TRANSPARENT to clients and hosts... as it stands, an application with a static HttpClient resolving DNS (unknowingly) by a traffic manager will never receive a new host when the previously resolved host becomes unhealthy.

As of netcore 1.1 I'm currently searching for a resolution to this exact problem. I don't see one, nativly anyway.

@kudoz83 You are correct, a host doesn't magically know when a traffic manager has adjusted the DNS resolution, hence why I said

when that DNS entry is changed, the application on host A should be told

If you don't believe that it is practical to notify an origin server that it should return connection:close headers then you are left with just a few choices:

  • Periodically close connections from a client or middleware and pay the cost of re-opening those connections redundantly
  • Have the client poll a service to know when it should reset connections.
  • Use a piece of middleware that is smart enough to know when the DNS has been switched.

And regarding me not understanding the use case. The original issue was based on a blog post where the use case was related to switching between production and staging slots and had nothing to do with health monitoring. Although both scenarios could benefit from being able to receive notifications of DNS/slot switches.

AFAIK there is not a way to notify a failed host that it is failed? Generally this would mean it is in some kind of error state. The original post is talking about failover scenarios.

Azure Traffic Manager is explained here https://docs.microsoft.com/en-us/azure/traffic-manager/traffic-manager-monitoring

It operates at the DNS level and is transparent to clients and hosts - by design. Azure instances communicating with one another via Traffic Manager for DNS resolution are in a pickle without being able to set a TTL on DNS refresh for an HttpClient.

The way HttpClient currently operates appears to be at odds with Microsoft's own Azure service offerings - which is the entire point. Not to mention, any similar service (E.G. Amazon Route 53) works exactly the same way, and has the exact same short DNS TTL dependency.

Obviously I found this thread because I'm impacted by the issue. I'm simply trying to clarify that there doesn't appear to be an ideal workaround currently other than just arbitrarily recreating HttpClients (at the cost of performance and complexity) for the sake of making sure DNS failover succeeds.

AFAIK there is not a way to notify a failed host that it is failed?

If a "failed" host could not execute any code then we wouldn't need the HTTP status code 503 because an origin server would never be capable of returning it. There are scenarios where a failed host may not be capable of processing a notification, but in those cases it probably isn't going to hold an active TCP/IP connection open for very long either.

When Ali originally brought this topic up on Twitter, before he wrote the blog post, he was concerted about the fact that a client would continue making requests and getting responses from a server that had been swapped out.

I understand your situation is different and that you cannot rely on an origin server being able to reliably close a connection. What I'm not sure I understand is why for your situation, you can't use a HttpMessageHandler to detect 5XX responses, close the connection and retry the request. Or even simpler, just convince your web server to return connection:close whenever it returns a 5XX status code.

It's also worth noting that the behavior you are concerned isn't in HttpClient. It is in the underlying HttpHandler or even below that, and there are a number of different implementations of HttpHandler in use. In my opinion, the current behavior is consistent with the way HTTP is designed to work. I agree there is a challenge when DNS settings are updated while connections are open, but this isn't a .net problem. This is a HTTP architecture problem. That doesn't mean .net couldn't do something to make this less of a problem.

I curious as to what you believe the solution should be. Do you think something like the ConnectionLeaseTimeout should be re-implemented that simply drops a connection periodically regardless of whether there has been a failover or not? Or do you have a better solution?

Sorry,

I Shouldn't have come off as combative as I sounded yesterday. Was having a bad day.

Honestly I'm not aware how ConnectionLeaseTimeout works beneath the covers. The desired functionality would be for HttpClient to be configurable so as to perform a new DNS lookup, prior to establishing new connections as opposed to caching the previous DNS lookup until the client is disposed. I don't believe this requires killing an existing connections...

@kudoz83 No worries, we all have days like that :-)

From what I understand ConnectionLeaseTimeout is literally a timer that periodically closes an idle connection from the client side. It is implemented in ServicePointManager, which doesn't exist in core. This is different than the "idle connection" timeout which only closes a connection once it has not be used for a certain period of time.

There is likely a Win32 call that would allow flushing the DNS client cache. That would ensure new connections do a DNS lookup. I took a quick look for it but didn't find it, but I'm sure it is there somewhere.

In .net core on Windows, the management of HTTP connections is actually left to the OS. The closest you get is the call to the Win32 API WinHttpOpen https://github.com/dotnet/corefx/blob/master/src/System.Net.Http.WinHttpHandler/src/System/Net/Http/WinHttpHandler.cs#L718 There are no public APIs that I can find to do anything with the _sessionHandle. To get the kind of behavior change you are looking for would require the OS folks to make changes. At least that's how I understand it.

Wouldn't it be easier to get Azure Traffic Manager just to set a very low TTL on the DNS entries? If you aren't concerned about long lived connections, then a low TTL should minimize the chance of new connections being setup against a dead server.

The upside of the way WinHttpHandler is built is that we get things like HTTP/2 for free with the OS update. The downside is we don't get a whole lot of control in managed code.

@darrelmiller

[ConnectionLeaseTimeout] is implemented in ServicePointManager, which doesn't exist in core.

It seems that ServicePoint.ConnectionLeaseTimeout is coming back in .Net Standard 2.0/.Net Core 1.2. (Though I have no idea if e. g. WinHttpHandler connections will adhere to it.)

I think there's some miscommunication here: DNS TTLs are completely unrelated to HTTP/TCP connection lifetimes.

New HTTP connections do a DNS lookup and connect to the returned host address. Multiple HTTP requests can be made over the same single connection but once opened the HTTP connection (and the underlying TCP connection) does not do a DNS lookup again.

This is correct behavior so even if you make 1000's of requests over a connection that lasts for days, the DNS resolution doesn't matter since the connection hasn't changed. If you make a new HTTP request over a new HTTP connection then you will see a DNS lookup for the new connection.

This DNS refresh has nothing to do with HttpClient, which just uses HttpClientHandler by default (which itself uses the HttpWebRequest sockets stack) and any DNS resolution is handled by that stack. ServicePointManager manages ServicePoint objects in that stack which are connections to a specific host. ServicePointManager does have some minimum client-side caching of DNS lookups which you can tweak.

Default settings are meant to have connections stay open indefinitely for efficiency. If the connected host becomes unavailable then the connection will be broken automatically, causing a new connection with a new DNS lookup, however DNS TTLs will not cause any new connections. If you need that, you'll have to code that into your app.

If you want another DNS lookup to be done, you need to reset the connection on the client. You can force a max-lifetime of the TCP connection, send a connection: close header or just dispose the HttpClient or the underlying ServicePoint. You can also do the DNS resolution with Dns.GetHostAddresses and use the IP address instead for your HTTP requests.

Hope that helps, if I'm off on this then please let me know.

@manigandham

I think there's some miscommunication here: DNS TTLs are completely unrelated to HTTP/TCP connection lifetimes.

Yes DNS TTLs are fairly unrelated, but not completely.

New HTTP connections do a DNS lookup and connect to the returned host address.

Yes, that is correct. But Windows has a聽local DNS cache that respects the TTL. If the TTL is high and the DNS server has changed the DNS聽record, then the new connection will be opened against the old聽IP address due to client caching of the DNS record. Flushing the client DNS cache is the only workaround to this that I am aware of.

Multiple HTTP requests can be made over the same single connection but once opened the HTTP connection (and the underlying TCP connection) does not do a DNS lookup again.

Correct again. 聽In a production/staging swap where the swapped out server is still responding, this will cause聽many future requests to go to the old server, which is bad. However, the most recent conversation has been about the scenario where a server fails and the Traffic Manager swaps to a new server. What happens to the existing connection to the failed server聽is unknown. If it is a severe hardware failure, then there is a good chance the connection will be dropped. That would force the client to re-establish a connection and that is where a low TTL is useful. If however, the server failure聽doesn't break the server and it simply returns 5XX responses then it would be useful for the server to return connection:close to ensure that future requests will be made on a new connection, hopefully to the new working server

This DNS refresh has nothing to do with HttpClient, which just uses HttpClientHandler by default (which itself uses the HttpWebRequest sockets stack)

Yes and no. On .Net core, the HttpClientHandler聽has been re-written and no-longer uses HttpWebRequest and therefore doesn't use ServicePointManager. Hence the reason for this issue.

I took a closer look at IIS and there doesn't appear to be a way to get聽5XX responses to come with connection:close headers. It would be necessary to implement a HttpModule to do that magic.

As an FYI, here's a related post covering this issue and an interim solution. Beware of HttpClient

Either ServicePointManager should be brought over to .NET Core

While we have brought ServicePointManager and HttpWebRequest for platform parity, it doesn't support most of the functionality in .Net Core as the underlying stack resolves to either WinHTTP or Curl which do not expose the advanced control knobs.

Using ServicePointManager to control HttpClient is actually a side-effect on using HttpWebRequest and the managed HTTP stack on .Net Framework. Since WinHttpHandler/CurlHandler are used on .Net Core, SPM has no control over HttpClient instances.

or similar equivalent functionality should be enabled some other way.

Here's WinHTTP team's answer:

  1. For security reasons, while the client keeps sending requests and _there are active connections_, WinHTTP will continue to use the destination IP. There is no way to limit the time these connections remain active.
  2. If all connections to the remote endpoint were closed by the server, _new_ connections will be made querying DNS and thus towards the new IP.

To force clients to the next IP, the only viable solution is ensuring that the server rejects new connections and closes existing ones (already mentioned on this thread).
This is the same with what I understand from the ATM documentation: detection is made based on GET requests failing 4 times (non-200 responses) in which case the connections will be closed by the server.

Given other network appliance devices that are themselves caching DNS (e.g. home routers/APs) I wouldn't recommend DNS fail-over as a solution for low-latency, high availability technique. If that is required, my recommendation for a controlled fail-over would be having a dedicated LB (hardware or software) that knows how to properly drain-stop.

as it stands, an application with a static HttpClient resolving DNS (unknowingly) by a traffic manager will never receive a new host when the previously resolved host becomes unhealthy.

As mentioned above, if by unhealthy you mean that the server returns a HTTP error code and closes the TCP connection while you observe the DNS TTL expired, this is indeed a bug.

Please describe your scenario in more detail:

  1. During the failover, what is the client-machine's DNS cache and TTLs. You can use ipconfig /showdns and nslookup -type=soa <domain_name> to compare what the machine thinks the TTL is and the authoritative NS's TTLs.
  2. Are there active connections towards the remote server? You can use netstat /a /n /b to see the connections and processes owning them.
  3. Creating a repro would help us a lot: please attach any available info such as: version of .Net Core/Framework, code for client/server, network traces, remote DNS LB/Traffic Manager involved, etc.

If you connect to a machine and the TCP/HTTP connection continues to work, why should you at some point assume the server is no longer appropriate.
I think there is room for improvement on the Azure side. For example, when machines are unhealthy or swapping environments, existing connections can be closed.

or just dispose the ... underlying ServicePoint

@manigandham How? It is not disposable and the only relevant method I can see is CloseConnectionGroup which you can't really use as the group name is an implementation detail (currently some hash of the handler). Also, I'm not sure what would happen if you tried disposing the ServicePoint while the HttpClient was using it (especially concurrently).

@ohadschn

.NET Core 2.0 brings back the ServicePoint classes and functionality: https://docs.microsoft.com/en-us/dotnet/api/system.net.servicepoint?view=netcore-2.0

You can go back to using the connection lease timeout to set a max timeframe for active connections before they get reset.

Nothing exotic about closing the connection while HttpClient request is in progress, it would be the same as you losing connectivity when browsing the internet. Either the request never makes it or it does and you don't get the response back - both should lead to http errors returned and you'll have to handle them in your application.

With the connection lease timeout setting you can ensure this is done only after a request is sent and reset before the next one over the time limit.

.NET Core 2.0 brings back the ServicePoint classes and functionality: https://docs.microsoft.com/en-us/dotnet/api/system.net.servicepoint?view=netcore-2.0
You can go back to using the connection lease timeout to set a max timeframe for active connections before they get reset.

Actually, it doesn't bring back the functionality completely. The API surface came back. But it is basically a no-op since the HTTP stacks don't use that object model.

cc: @stephentoub @geoffkizer

I'd consider many of the participants in this thread to be something of a "dream team" in this subject matter, yet I'm still not sure if any consensus has been reached on what the best practices are given what we have today. It would be great to get a decisive answer from this group on a couple key points.

Let's say I have a general purpose HTTP library targeting many platforms, not all of which support ServicePointManager, and I'm attempting provide behavior that manages HttpClient instances smartly, per best practices, by default. I have no idea ahead of time what hosts will be called, let alone whether they're well behaved with respect to DNS changes/Connection:close headers.

  1. How do I optimally manage HttpClient instances? One per endpoint? One per host/schema/port? Literally one singleton total? (I doubt that, since consumers may want to take advantage of default headers, etc.)

  2. How would I most effectively deal with this DNS issue? Dispose instances after 1 minute? Treat it as "highly unlikely" and don't provide any disposal behavior by default?

(Not a hypothetical situation btw; I'm wrestling with these exact questions right now.)

Thanks!

I wouldn't even be invited to the dream team tryouts, but I do have one insight regarding your singleton HttpClient comment. First, you can always use the SendAsync method to specify different headers and addresses, so you could wrap that with your class that provides shared data similar to HttpClient (e.g. default headers, base address) yet ends up calling the same single client. But even better, you can create multiple HttpClient instances who all share the same HttpClientHandler: https://github.com/dotnet/corefx/issues/5811.

As for the DNS issue, the consensus here seems to be that:

To force clients to the next IP, the only viable solution is ensuring that the server rejects new connections and closes existing ones

If this is not viable for you, Darrel Miller suggested some more options here: https://github.com/dotnet/corefx/issues/11224#issuecomment-270498159.

@ohadschn Thanks, and hopefully I'm not getting too off topic, but is there any _benefit_ at all to sharing the same HttpClient (or HttpClientHandler) between multiple hosts? I could be wrong but it seems to me that a new host means new DNS lookup, new connection, new socket. Can a socket in TIME_WAIT be reclaimed and re-purposed for a different host? I had assumed not, but we're definitely reaching the limits of my expertise. If it can, perhaps this is the benefit of using an HttpClient singleton even with multiple hosts?

Regarding the DNS advice, I felt like that's where consensus was lacking most, and understandably so. Ensuring the server behaves correctly isn't viable if you don't have control of what's happening on that end. I _could_ design something where instances are disposed periodically. The question is whether that's worth the trade-off in the majority of cases. I guess that's largely up to me to make a judgement call. :)

To me the most obvious benefit to sharing the same HttpClient[Handler] is simplicity. You won't have to keep things like mapping between hosts and clients. I wouldn't know about the TIME_WAIT stuff, really not my expertise either.

Regarding DNS, you mentioned one possible approach when you don't control the server. You could also implement a mechanism that periodically queries the DNS and disposes instances only if an actual change occurred...

While the given answers that specify what should happen on the server-side are accurate for an ideal world, in the real world developers often do not have control of the full server-side stack, that they are communicating with. In that regard I really like the concept of a ConnectionLeaseTimeout, as it is a concession to the realities of the daily life of John and Jane Doeveloper.
Someone suggested writing middleware to enforce such a timeout on .NET Core. Would this class based on NodaTime and System.Collections.Immutable do the job?

    public class ConnectionLeaseTimeoutHandler : DelegatingHandler
    {
        private static string GetConnectionKey(Uri requestUri) => $"{requestUri.Scheme}://{requestUri.Host}:{requestUri.Port}";

        private readonly IClock clock;
        private readonly Duration leaseTimeout;
        private ImmutableDictionary<string, Instant> connectionLeases = ImmutableDictionary<string, Instant>.Empty;

        public ConnectionLeaseTimeoutHandler(HttpMessageHandler innerHandler, IClock clock, Duration leaseTimeout)
            : base(innerHandler)
        {
            this.clock = clock;
            this.leaseTimeout = leaseTimeout;
        }

        protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
        {
            string key = GetConnectionKey(request.RequestUri);
            Instant now = clock.GetCurrentInstant();

            Instant leaseStart = ImmutableInterlocked.GetOrAdd(ref connectionLeases, key, now);

            if (now - leaseStart > leaseTimeout)
            {
                request.Headers.ConnectionClose = true;
                ImmutableInterlocked.TryRemove(ref connectionLeases, key, out leaseStart);
            }

            return base.SendAsync(request, cancellationToken);
        }
    }

@snboisen I think that looks right but don't hold me to that. :)

For me the bigger question is whether sending a Connection: close header is even a viable solution. It's worth repeating here what @darrelmiller said in the comments of the blog post that prompted this thread:

ConnectionLeaseTimeout is a strange beast, that I wasn't aware of until Oren mentioned it today. It will cause outgoing requests to include Connection:Close headers after the lease expires, causing the connection to be terminated after the server sends its response. It makes sense why this is set to infinite to default because it is a strange way to force connections to be closed when a client is actively making requests.

I respect Darrel's opinion and it makes this whole approach seem rather hackish. On the other hand, I agree that what's a developer to do who doesn't have control of what's happening on the server side? If it works _sometimes_, maybe it is the best option we have? It's certainly "cheaper" than periodically disposing HttpClients or periodically doing DSN lookups.

Is there an Azure ticket open for this issue?
From my perspective it looks you are all trying to workaround the way this is implemented in Azure's traffic manager.
I'm not an Azure user, so I look at it from a distance, feeling less inclined to jump through its hoops.

@tmds I haven't seen one, but as others have mentioned, this relates to more than just Azure Traffic Manager: AWS has a similar concept and there's Azure App Service with its staging/production slot swapping. And I bet other cloud providers also offer something similar to this swapping mechanism.

@snboisen Are you saying AWS and other cloud providers (which?) are also using DNS timeouts to implement these features?

@tmds For AWS, yes, at least for failover: https://aws.amazon.com/route53/faqs/#adjust_ttl_to_use_failover
I don't know about other cloud providers, but it seems likely that some do as well.

@tmds For AWS, yes, at least for failover: https://aws.amazon.com/route53/faqs/#adjust_ttl_to_use_failover
I don't know about other cloud providers, but it seems likely that some do as well.

There are two problematic scenarios here:

  • client is connected to a server that is no longer there
  • client is connected to a server that is now running an out-dated version of the software

DNS failover is related to the first. It's appropriate to remove unreachable servers from DNS (in AWS, Azure, ...).
The proper way to detect it at the client is a timeout mechanism. I'm not sure if there is such a property on HttpClient (maximum reply timeout on an established connection).
At the protocol level, the newer websocket and http2 protocols define ping/pong messages to check connection liveness.

When I suggest to open an issue with Azure, it's for the second scenario.
There is no reason a client should close a working connection. If the server is no longer appropriate, clients should be disconnected and the server must be made unreachable.
I found a nice doc on AWS Blue/Green deployments: https://d0.awsstatic.com/whitepapers/AWS_Blue_Green_Deployments.pdf.
It lists several techniques. For those that use DNS it mentions _DNS TTL complexities_. For those that use a load balancer _no DNS complexities_.

@tmds Thanks for the link, that's very interesting.

I do wonder how common it is to rely on DNS TTL vs using a load balancer. The latter seems like a much more reliable solution in general.

Experienced some similar behavior when working with 2 different API Apps in Azure App Services that communicated via HttpClient. We noticed around 250 requests into our load tests, we'd start getting timeouts and "A connection could not be established" HttpRequestException - It didn't matter if it was concurrent users, or sequential requests.

When we switched to a using block to ensure the HttpClient is disposed in each request, we didn't receive any of these exceptions. The performance hit, if any, was negligible from a Singleton to the using syntax.

When we switched to a using block to ensure the HttpClient is disposed in each request, we didn't receive any of these exceptions. The performance hit, if any, was negligible from a Singleton to the using syntax.

It's not necessarily about performance, it's about not leaking TCP connections and getting SocketExceptions.

From MSDN:

HttpClient is intended to be instantiated once and re-used throughout the life of an application. Especially in server applications, creating a new HttpClient instance for every request will exhaust the number of sockets available under heavy loads. This will result in SocketException errors.

Also, see https://aspnetmonsters.com/2016/08/2016-08-27-httpclientwrong/

@khellang Thanks for the link (and the clarification on the SocketExceptions, I definitely had a misunderstanding there)- I've come across that post and 3 or 4 other posts outlining this exact same thing.

I understand that a single-instantiation is the intention, but when we're using it in Azure App Services, we are consistently hitting HttpRequestExceptions under a pretty modest load (mentioned above, 250 sequential requests). Unless there is another solution, disposing the HttpClient is the workaround we had to implement in order to avoid running into these consistent exceptions.

I'm assuming a "more correct" solution would be to catch these exceptions, cancel all pending requests explicitly dispose the HttpClient, re-instantiate the HttpClient, and retry all of the requests.

Seems like a bit of a mess for something that

is intended to be instantiated once and re-used throughout the life of the application

We'd love a cleaner solution, but it seems like it's not something wrong with our code, rather a limitation with the framework. I'm more than happy to be wrong, I just need some guidance on how we can use this as intended and not encounter these issues.

@snboisen I tested your ConnectionLeaseTimeoutHandler and I find that it doesn't work with multiple connections. A "close" is issued to only one connection each time the lease timeout occurs, but there may be multiple connections open with a single HttpClient instance when sending requests in parallel. From what I see in fiddler (I use a bad IP address to generate errors to check the DNS switch) after "close" is sent some requests use the new IP and some requests use the old IP. In my case I have a connection limit of 2 and it's taking several iterations of the connection lease timeout to get both connections refreshed. It would obviously be much worse with more concurrent connections from a production server, but it would probably eventually switch over after some time.

I don't know of a way to issue "close" to every individual connection in use by an HttpClient instance. That would certainly be ideal, but what I'm going to do is use a GetOrCreate() factory method for HttpClient and creating a new HttpClient instance every 120 seconds to ensure all new connections are created. I'm just not sure how to go about disposing of the old HttpClient instance once all requests are finished to close the connections, but I think .NET takes care of that because it's never been an issue creating a new HttpClient instance per request.

As a side note, I believe the refresh has to occur on the client. The server can't be responsible for communicating DNS changes nor any middleware. The purpose of DNS is to offset such responsibility so you can change IP's anytime with no coordination between a client and server and not just to provide a friendly name for an IP. The perfect use case is an AWS Elastic IP address. The change is only reflected in DNS. AWS is not going to know to send "close" headers from your web server running on an ec2 instance. AWS isn't going to be aware of every web server sitting atop of an ec2 instance which they manage. Scenario's for the old server not sending over a "close" header is new server deployments where the exact same software is being deployed to a new server with all traffic switched over by DNS. Nothing should change on the old server, especially if you have to switch back. Everything should be changed smoothly in one place with DNS.

Regardless of the bigger issues discussed here, is ConnectionLeaseTimeout now implemented in core? And if so, why is this issue not closed? I was able to code an example exactly like the original post with no issues just now. Whether that solves my connection DNS issues or not, it _seems_ ConnectionLeaseTimeout _is_ implemented in core.

ServicePointMananager.ConnectionLeaseTimeout is not implemented in .NET Core. Using this property in .NET Core is a no-op. It is functional only in .NET Framework. And by default, this property is set to "off" in .NET Framework.

I understand that a single-instantiation is the intention, but when we're using it in Azure App Services, we are consistently hitting HttpRequestExceptions under a pretty modest load (mentioned above, 250 sequential requests). Unless there is another solution, disposing the HttpClient is the workaround we had to implement in order to avoid running into these consistent exceptions.

We have been successful in remedying the socket exhaustion issue by wrapping HttpClient in a custom API that allows every http request in an application to re-use the same HttpClient instance (it does more than just this, like RestSharp but this is one of the benefits too). Retrieving this is behind a lock{} because every 60 minutes we create a new HttpClient and pass that one back instead. This has allowed us to do high volume http requests from our web application (due to heavy integrations we have) while still not re-using old connections that have not been closed. We've been doing this for about a year now with success (after hitting the socket exhaustion issues).

I'm sure there are still performance implication in this (especially due to all the locking) but it's better than everything dying due to socket exhaustion.

@KallDrexx why do you need to lock the 'global' instance? Naively I would store it in static and just replace it every 60 minutes with a new one. Would that not be enough?

Our code to get an HttpClient is:

```c#
var timeSinceCreated = DateTime.UtcNow - _lastCreatedAt;
if (timeSinceCreated.TotalSeconds > SecondsToRecreateClient)
{
lock (Padlock)
{
timeSinceCreated = DateTime.UtcNow - _lastCreatedAt;
if (timeSinceCreated.TotalSeconds > SecondsToRecreateClient)
{
_currentClient = new HttpClient();
_lastCreatedAt = DateTime.UtcNow;
}
}
}

        return _currentClient;

```

So I guess the the lock only occurs once per minute so that multiple threads aren't trying to create a new HttpApiClient simultaneously. I thought we were locking more aggressively.

I see, so it is a convenience reason.
Alternatively you could either use Timer-based (async) update, or allow multiple of them being created in parallel (thread-unsafe lazy initialization style), or use different synchronization mechanism and don't block the other threads during the creation time (they could reuse the old one).

I see, so it is a convenience reason.

I don't know if I would classify it as a convienience.

If too many threads try to create a new HttpClient at the same time we still risk socket exhaustion from too many being created every minute, especially if the old ones haven't had the sockets cleaned up properly in that one minute (causing a cascade).

Furthermore, I'd still need some measure of locking because I don't know if updating a DateTime is an atomic operation (I suspect not) and therefore some threads may read the _lastCreatedAt value in the middle of an update operation.

And in order for all threads to re-use the old client while simultaneously having one thread create a new client requires a lot of complex logic so we are guaranteed that one thread does successfully create the new client. Not to mention we still need locks because we cannot gaurantee that one thread won't return _currentClient while another thread is instantiating it.

There are far too many unknowns and added complexity to risk doing that in production.

@KallDrexx

  • It seems ReaderWriterLockSlim would be a better fit for your usage.
  • DateTime assignment should be atomic if you are running true x64 (because it contains a single ulong field). Note that your current code is not thread-safe if that is not the case, because you are observing the DateTime value outside the lock.
  • The timer-based approach suggested by @karelz won't require a DateTime at all... the timer would simply switch up the HttpClient. You would still need to surround the HttpClient field with some memory barrier though (volatile, Interlocked.Exchange etc).

It seems ReaderWriterLockSlim would be a better fit for your usage.

Neat I hadn't known about that class! However, you mentioning this also made me think about also using a SemaphoreSlim, as that is async/await friendly (where it appears ReaderWriterLockSlim isn't), which may help with possible thread contention when a new HttpClient is required. Then again the ReaderWriterLockSlim will handle better the potential DateTime non-atomicy issue you rightly pointed out so I'll have to give this some thought, thanks!

The timer-based approach suggested by @karelz won't require a DateTime at all... the timer would simply switch up the HttpClient. You would still need to surround the HttpClient field with some memory barrier though (volatile, Interlocked.Exchange etc).

Ah I had misread what he meant. That does make sense, though I need to get more familiar with the memory barrier mechanisms to do it properly.

Thanks.

Edit: Just to add, this is a good reason why I think something should be built in to handle this, so every consumer of HttpClient doesn't have to re-implement/go through this.

I wonder if it is still a problem in .net core 2.0? Do we still need to worry about the DNS change?

I wonder if it is still a problem in .net core 2.0? Do we still need to worry about the DNS change?

My understanding is that it isn't an issue with Dns itself, just that HttpClient will try to re-use existing connections (to prevent socket exhaustion), and thus even when DNS changes you are still connecting to the old server because you are going over the previous connection.

@KallDrexx , so it means it will still have issues for blue-green deployment?

@withinoneyear from my understanding yes. If you create an HttpClient and utilize it against Production Green, then swap production to Blue the HttpClient will still have an open connection to Green until the connection is closed.

I know this is old, but I believe you should not use DateTime.UtcNow for time interval measurement. Use a monitonic time source like Stopwatch.GetTimestamp. This way you鈥檙e immune to local time adjustment by operator or time synchronization. There were some buggy chipsets that made QPC values go back, but I don鈥檛 believe many of them are in use now.

var sp = ServicePointManager.FindServicePoint(new Uri("http://foo.bar/baz/123?a=ab"));
sp.ConnectionLeaseTimeout = 60*1000; // 1 minute

This code is placed in the constructor?

@whynotme8998 @lundcm I think not. If you're HttpClient is singleton and you access multiple enpoints the constructor would not be the correct place to place it.
Check this implementation. Also in the README of the project there's a link to a blog post explaining the problem. That article also points to this thread.

Can someone help me understand exactly what ConnectionLeaseTimeoutmeans?

If in my App I set it for 1 minute, does that mean every minute the connection lease will timeout? What exactly does that mean for talking with my back-end?

I see people are talking about the DNS updating on the back-end? On a standard Azure Web App, would publishing my ASP.NET back-end cause a DNS refresh?

Wouldn't a 1 minute timeout potentially leave a window where the DNS updated yet the timeout still hasn't occurred yet?

Edit. Realizing this doesn't work in .NET Standard 2.0. What is the work around?

Waiting for that https://www.stevejgordon.co.uk/introduction-to-httpclientfactory-aspnetcore or implementing bits of it yourself.

Changing ConnectionLeaseTimeout did not work on Xamarin with .netstandard 2.0 !!!

@paradisehuman I assume you mean ServicePoint.ConnectionLeaseTimeout. AFAIK it does not work either on .NET Core. In .NET Core 2.1 we introduced SocketsHttpHandler.PooledConnectionLifetime. Also the new HttpClientFactory in ASP.NET can leverage both and does even more for you.
Mono/Xamarin has its own networking stack implementation (they do not consume CoreFX source code for networking) - so it would be best to file the bug on their repo with more details.

Unfortunately, there's no way to do this with .NET Core today. Either ServicePointManager should be brought over to .NET Core or similar equivalent functionality should be enabled some other way.

@onovotny, SocketsHttpHandler in .NET Core 2.1 exposes PooledConnectionLifetime, which serves a purpose similar to ConnectionLeaseTimeout, just at the handler level. Can we consider this issue addressed at this point?

@stephentoub just seeing this now, yes, I think PooledConnectionLifetime should address it. The community can open another issue if something is still needed.

@onovotny @karelz @stephentoub,

Would you all concur that this is an accurate summary of how best to solve the original singleton HttpClient/DNS problem?

  • In .NET Framework 2.0+, use ServicePoint.ConnectionLeaseTimeout.

  • In .NET Core 2.1, use SocketsHttpHandler.PooledConnectionLifetime.

  • For all other platforms/targets/versions, there is no reliable solution/hack/work-around, so don't even bother trying.

As I continue to learn things from this thread and others (such as ConnectionLeaseTimeout not working reliably across platforms in .NET Standard 2.0), I'm back to the drawing board trying to solve this from a library in a way that works on as many platforms/versions as possible. Platform-sniffing is fine. Periodically disposing/recreating HttpClient is fine, if that's effective. I'm fairly certain I got it wrong in my first attempt, which was to simply send a connection:close header to the host at regular intervals. Thanks in advance for any advice!

@tmenier For all other platforms/targets/versions, there is no reliable solution/hack/work-around, so don't even bother trying.

You can always recycle the instance on regular basis (e.g. just create new instance and set it to static variable which others use).
That's what HttpClientFactory does. You can choose to use HttpClientFactory instead.

@karelz Thank you, I think this provides me a way forward. Perhaps I should just skip the platform sniffing and recycle instances everywhere, or is advisable to use the other approaches where available (to avoid overhead, etc.)?

The tricky bit might be knowing _when_ it's safe to dispose recycled instances (I need to worry about concurrency/pending requests), but I can take a peek at how HttpClientFactory deals with that. Thanks again.

@tmenier You do not want to recycle instances every HttpClient call, as that will lead you to tcp socket exhaustion. The recycling needs to be managed (timeout or maintaining a pool with HttpClients that can be expired.

@KallDrexx Right, I'm just talking about _periodically_ recycling a singleton (or "pseudo"-singleton), such as every few minutes. I do believe it will involve managing a pool and some kind of delayed disposal of dead instances, once it's determined they have no pending requests.

Ah sorry misunderstood what you meant by recycle.

You should not need to dispose old ones. Let GC collect them.
The code should be as simple as setting new "pseudo"-singleton periodically based on timer. Nothing more.

I鈥檝e found that not disposing them can lead to slowdowns as you run out of http connections in high performance scenarios.

@tmenier About using SocketsHttpHandler.PooledConnectionLifetime in In .NET Core 2.1, the creation of HttpClientHandler in completely encapsulated in https://github.com/dotnet/wcf/blob/master/src/System.Private.ServiceModel/src/System/ServiceModel/Channels/HttpChannelFactory.cs. Is there a way to do this when using the WCF client?

@edavedian I'd suggest to ask on dotnet/wcf repo. cc @mconnew @Lxiamail

the underlying stack resolves to either WinHTTP or Curl which do not expose the advanced control knobs

Is there in .Net Core an HTTP client that works over sockets?

Is there in .Net Core an HTTP client that works over sockets?

Yes, SocketsHttpHandler, and as of 2.1 it's the default.

Does HttpClient cache URIs' it hits, similarly to what RestClient I believe does? If it does, please let me know on which version (.NET Core 3.0/.NET Framework 3.0 etc) it supports this.

I see that caching is only done in RestClient if the project is a .Net Core 2.1+ App?

Does HttpClient cache URIs' it hits, similarly to what RestClient I believe does?

HttpClient does not do any caching.

@SpiritBob it's not clear what form of caching you're referring to. This issue is closed; if you would please open a new issue with questions we can help you find answers.

@scalablecory I believe davidsh responded with what I had in mind. Thanks!

Just finished reading this thread, and it seems like there was a use case that wasn't addressed.

When using HttpClientFactory, will the resulting HttpClients automatically close connections and do a new DNS lookup when they receive an HttpRequestException with an inner SocketException indicating that the host is no longer valid and that DNS has potentially changed (error code 11001 - no such host is known)?

If not, what is the workaround to make HttpClient perform DNS lookups before the TTL expires on DNS records?

I'm currently using HttpClientFactory with a custom HttpMessageHandler (configured via dependency injection using AddHttpClient<T, U>().ConfigurePrimaryHttpMessageHandler(...)) and when a DNS record changes before the TTL expires, it errors out. Specifically, I see it happening when doing a bunch of uploads to Azure Blob Storage.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

v0l picture v0l  路  3Comments

nalywa picture nalywa  路  3Comments

jzabroski picture jzabroski  路  3Comments

omajid picture omajid  路  3Comments

EgorBo picture EgorBo  路  3Comments