Runtime: HttpClient throws TaskCanceledException on timeout

Created on 25 May 2017  ·  119Comments  ·  Source: dotnet/runtime

HttpClient is throwing a TaskCanceledException on timeout in some circumstances. This is happening for us when the server is under heavy load. We were able to work around by increasing the default timeout. The following MSDN forum thread captures the essence of the experienced issue: https://social.msdn.microsoft.com/Forums/en-US/d8d87789-0ac9-4294-84a0-91c9fa27e353/bug-in-httpclientgetasync-should-throw-webexception-not-taskcanceledexception?forum=netfxnetcom&prof=required

Thanks

area-System.Net.Http enhancement

Most helpful comment

@karelz adding a note as another customer impacted by this misleading exception :)

All 119 comments

@awithy what version of .NET Core are you using?

1.1

Whether the right design or not, by design OperationCanceledExceptions are thrown for timeouts (and TaskCanceledException is an OperationCanceledException).
cc: @davidsh

Our team found this unintuitive, but it does seem to be working as designed. Unfortunately, when we hit this, we wasted a bit of time trying to find a source of cancelation. Having to handle this scenario differently from task cancelation is pretty ugly (we created custom handler to manage this). This also seems to be an overuse of cancelation as a concept.

Thanks again for thee quick response.

Seems like by design + there were only 2 customers complains in last 6+ months.
Closing.

@karelz adding a note as another customer impacted by this misleading exception :)

Thanks for info, please upvote also top post (issues can be sorted by it), thanks!

Whether the right design or not, by design OperationCanceledExceptions are thrown for timeouts

This is a bad design IMO. There's no way to tell if the request was actually canceled (i.e. the cancellation token passed to SendAsync was canceled) or if the timeout was elapsed. In the latter case, it would make sense to retry, whereas in the former case it wouldn't. There's a TimeoutException that would be perfect for this case, why not use it?

Yeah, this also threw our team for a loop. There's an entire Stack Overflow thread dedicated to trying to deal with it: https://stackoverflow.com/questions/10547895/how-can-i-tell-when-httpclient-has-timed-out/32230327#32230327

Awesome, thanks.

Reopening as the interest is now bigger - the StackOverflow thread has now 69 upvotes.

cc @dotnet/ncl

Any news ? Still happen in dotnet core 2.0

It is on our list of top problems for post-2.1. It won't be fixed in 2.0/2.1 though.

We have several options what to do when timeout happens:

  1. Throw TimeoutException instead of TaskCanceledException.

    • Pro: Easy to distinguish timeout from explicit cancellation action at runtime

    • Con: Technical breaking change - new type of exception is thrown.

  2. Throw TaskCanceledException with inner exception as TimeoutException

    • Pro: Possible to distinguish timeout from explicit cancellation action at runtime. Compatible exception type.

    • Open question: Is it ok to throw away original TaskCanceledException stack and throw a new one, while preserving InnerException (as TimeoutException.InnerException)? Or should we preserve and just wrap the original TaskCanceledException?



      • Either: new TaskCanceledException -> new TimeoutException -> original TaskCanceledException (with original InnerException which may be null)


      • Or: new TaskCanceledException -> new TimeoutException -> original TaskCanceledException.InnerException (may be null)



  3. Throw TaskCanceledException with message mentioning timeout as the reason

    • Pro: Possible to distinguish timeout from explicit cancellation action from the message / logs

    • Con: Cannot be distinguished at runtime, it is "debug only".

    • Open question: Same as in [2] - should we wrap or replace the original TaskCanceledException (and it stack)

I am leaning towards option [2], with discarding original stack of original TaskCanceledException.
@stephentoub @davidsh any thoughts?

BTW: The change should be fairly straightforward in HttpClient.SendAsync where we set up the timeout:
https://github.com/dotnet/corefx/blob/30fb78875148665b98748ede3013641734d9bf5c/src/System.Net.Http/src/System/Net/Http/HttpClient.cs#L433-L461

The top-level exception should always be an HttpRequestException. That is the standard API model for HttpClient.

Within that, as inner exception, "timeouts" should probably be a TimeoutException. In fact, if we were to combine this with the other issues (about modifying HttpRequestion to include a new enum property similar to WebExceptionStatus), then developers would only have to check that enum and not have to inspect inner exceptions at all.

This is basically option 1) but continuing to preserve current app-compat API model that top-level exceptions from HttpClient APIs are always HttpRequestException.

Note: that any "timeouts" from directly reading an HTTP response stream from Stream APIs like:

```c#
var client = new HttpClient();
Stream responseStream = await client.GetStreamAsync(url);

int bytesRead = await responseStream.ReadAsync(buffer, 0, buffer.Length);
```

should continue to throw System.IO.IOException etc. as top-level exceptions (we actually have a few bugs in SocketsHttpHandler about this).

@davidsh are you suggesting to throw HttpRequestException even in cases when there is explicit user cancellation? Those throw TaskCanceledException today and it seems fine. Not sure if it is something we want to change ...
I'd be ok to expose client-side timeouts as HttpRequestException instead of TaskCanceledException.

@davidsh are you suggesting to throw HttpRequestException even in cases when there is explicit user cancellation? Those throw TaskCanceledException today and it seems fine. Not sure if it is something we want to change …

I need to test out the different behaviors of the stacks including .NET Framework to verify the exact current behaviors before I can answer this question.

Also, in some cases, some of our stacks are throwing HttpRequestException with inner exception of OperationCanceledException. TaskCanceledException is a derived type of OperationCanceledException.

when there is explicit user cancellation

We also need clear definition of what "explicit" is. For example, is setting the HttpClient.Timeout property "explicit"? I think we're saying no. What about calling .Cancel() on the CancellationTokenSource object (which affects the passed in CancellationToken object)? Is that "explicit"? There are probably other examples we can think of.

.NET Framework also throws TaskCanceledException when you set HttpClient.Timeout.

If you can't distinguish between cancellation from client and cancellation from implementation - just check both. Before using client token create derived one:
```c#
var clientToken = ....
var innerTokenSource = CancellationTokenSource.CreateLinkedTokenSource(clientToken);

If at some point you catch **OperationCancelledException**:
```c#
try
{
     //invocation with innerToken
}
catch(OperationCancelledException oce)
{
      if(clientToken.IsCancellationRequested)
          throw; //as is, it is client initiated and in higher priority
      if(innerToken.IsCancellationRequested)
           //wrap it in HttpException, TimeoutException, whatever, wrap it in another linked token until you process all cases.
}

Basically you filter all possible layers from client token to deepest one, until you find token which caused cancellation.

Any updates on this making 2.2 or 3.0? @karelz @davidsh, cc @NickCraver . I kind of wish it would just do option 1) of throwing a TimeoutException, or alternatively a new HttpTimeoutException that is derived from HttpRequestException. Thanks!

Any updates on this making 2.2 or 3.0?

Too late for 2.2, it was released last week ;)

Too late for 2.2, it was released last week ;)

Argh, that's what happens when I take a break ;)

Albeit a little goofy, the most simplistic workaround I've found is to nest a catch handler for OperationCanceledException and then evaluate the value of IsCancellationRequested to determine whether or not the cancellation token was the result of a timeout or an actual session abortion. Clearly the logic performing the API call would likely be in a business logic or service layer but I've consolidated the example for simplicity:

````csharp
[HttpGet]
public async Task Index(CancellationToken cancellationToken)
{
try
{
//This would normally be done in a business logic or service layer rather than in the controller itself but I'm illustrating the point here for simplicity sake
try
{
//Using HttpClientFactory for HttpClient instances
var httpClient = _httpClientFactory.CreateClient();

        //Set an absurd timeout to illustrate the point
        httpClient.Timeout = new TimeSpan(0, 0, 0, 0, 1);

        //Perform call that requires special timeout logic                
        var httpResponse = await httpClient.GetAsync("https://someurl.com/api/long/running");

        //... (if GetAsync doesn't fail, handle the response as desired)
    }
    catch (OperationCanceledException innerOperationCanceled)
    {
        //If a canceled token exception occurs due to a timeout, "IsCancellationRequested" should be false
        if (cancellationToken.IsCancellationRequested)
        {
            //Bubble exception to global handler
            throw innerOperationCanceled;
        }
        else
        {
            //... (perform timeout logic here)
        }                    
    }                

}
catch (OperationCanceledException operationCanceledEx)
{
    _logger.LogWarning(operationCanceledEx, "Request was aborted by the end user.");
    return new StatusCodeResult(499);
}
catch (Exception ex)
{
    _logger.LogError(ex, "An unexepcted error occured.");
    return new StatusCodeResult(500);
}

}
````

I've seen other articles about how to handle this scenario more elegantly but they all require digging into the pipeline etc. I realize that this approach isn't perfect by my hope is that there will be better support for actual timeout exceptions with a future release.

Working with http client is utterly abysmal. HttpClient needs deleted and started over from scratch.

Albeit a little goofy, the most simplistic workaround I've found is to nest a catch handler for OperationCanceledException and then evaluate the value of IsCancellationRequested to determine whether or not the cancellation token was the result of a timeout or an actual session abortion. Clearly the logic performing the API call would likely be in a business logic or service layer but I've consolidated the example for simplicity:

[HttpGet]
public async Task<IActionResult> Index(CancellationToken cancellationToken)
{
  try
  {
      //This would normally be done in a business logic or service layer rather than in the controller itself but I'm illustrating the point here for simplicity sake
      try
      {
          //Using HttpClientFactory for HttpClient instances      
          var httpClient = _httpClientFactory.CreateClient();

          //Set an absurd timeout to illustrate the point
          httpClient.Timeout = new TimeSpan(0, 0, 0, 0, 1);

          //Perform call that requires special timeout logic                
          var httpResponse = await httpClient.GetAsync("https://someurl.com/api/long/running");

          //... (if GetAsync doesn't fail, handle the response as desired)
      }
      catch (OperationCanceledException innerOperationCanceled)
      {
          //If a canceled token exception occurs due to a timeout, "IsCancellationRequested" should be false
          if (cancellationToken.IsCancellationRequested)
          {
              //Bubble exception to global handler
              throw innerOperationCanceled;
          }
          else
          {
              //... (perform timeout logic here)
          }                    
      }                

  }
  catch (OperationCanceledException operationCanceledEx)
  {
      _logger.LogWarning(operationCanceledEx, "Request was aborted by the end user.");
      return new StatusCodeResult(499);
  }
  catch (Exception ex)
  {
      _logger.LogError(ex, "An unexepcted error occured.");
      return new StatusCodeResult(500);
  }
}

I've seen other articles about how to handle this scenario more elegantly but they all require digging into the pipeline etc. I realize that this approach isn't perfect by my hope is that there will be better support for actual timeout exceptions with a future release.

It's absolutely bonkers we need to write this atrocious code. HttpClient is the leakiest abstraction ever created by Microsoft

I agree that it seems rather short sighted to not include more specific exception information around timeouts. Seems like a pretty basic thing for people to want to handle actual timeouts differently than cancelled token exceptions.

Working with http client is utterly abysmal. HttpClient needs deleted and started over from scratch.\

That's not very helpful or constructive feedback @dotnetchris . Customers and MSFT employees alike are all here to try and make things better, and that's why folks like @karelz are still here listening to customer feedback and trying to improve. There is no requirement for them to do development in the open, and let's not burn their good will with negativity or they could decide to take their ball and go home, and that would be worse for the community. Thanks! :)

Why not just a create HttpClient2 (temporary name) that has the expected behavior and inform, in the documentation, that "HttpClient is obsolete, please use the HttpClient2" and explained the difference between the 2 of them.

This ways no breaking change are inserted in the .NetFramework.

HttpClient shouldn't be based off throwing exceptions. It's not exceptional that an internet host is unavailable. It's actually expected, given the infinite permutations of addresses, the exceptional case is giving it something that works.

I even blogged about how bad using HttpClient is over 3 years ago. https://dotnetchris.wordpress.com/2015/12/11/working-with-httpclient-is-absurd/

This is glaringly bad design.

Has anyone in microsoft even used HttpClient? It sure doesn't seem like it.

Then beyond this horrible api surface, there's all kinds of leaky abstractions regarding the construction of it and threading. I can't believe it's possible to be salvageable and only brand new is a viable solution. HttpClient2, HttpClientSlim is fine or whatever.

Http access should never throw exceptions. It should always return a result, the result should be able to be inspected for completion, timeout, http response. The only reason it should be allowed to throw is if you invoke EnsureSuccess() then it's fine to throw any exception in the world. But there should be zero control flow handling for entirely normal behavior of the internet: host up, host not up, host timeout. None of those are exceptional.

HttpClient shouldn't be based off throwing exceptions. It's not exceptional that an internet host is unavailable. It's actually expected, given the infinite permutations of addresses, the exceptional case is giving it something that works.

@dotnetchris since it sounds like you have a lot of passion for this topic, why not take the time to write up your proposed full API for HttpClient2 and submit it to the team for discussion as a new issue?

I haven't bothered diving into the nuts and bolts of what makes HttpClient good or bad from a deeper technical level so I don't have anything to add to the comments of @dotnetchris. I definitely don't think it's appropriate to criticize the MS dev team but I also understand how frustrating it is to spend hours and hours researching solutions to situations that seemingly should be super simple to handle. It's the frustration of struggling to understand why certain decisions/changes were made, hoping that someone at MS will address the concern while still trying to get your work done and delivered in the meantime.

In my case, I have an external service I call during the OAuth process to establish user roles and if that call times out within a specified time span (which unfortunately happens more than I'd like), I need to revert to a secondary method.

Really there's a myriad of other reasons I can see why someone would want to be able to distinguish between a timeout and a cancellation. Why better support for a common scenario like this doesn't seem to exist is what I struggle to understand and I truly hope it's something the MS team will give some additional thought to.

@karelz what would be required to get this on schedule for 3.0, an API proposal from someone?

This is not about API proposal. This is about finding the right implementation. It is on our 3.0 backlog and fairly high on the list (behind HTTP/2 and enterprise scenarios). It is very good chance to be addressed in 3.0 by our team.
Of course if someone in the community is motivated, we will take a contribution as well.

@karelz how can the community help in this case? The reason I mentioned if an API proposal was needed is that I wasn't sure from your message that starts "We have several options what to do when timeout happens:" if the team had decided on their preferred alternative from the ones you listed. If you have decided on a direction then we can go from there. Thanks!

Technically speaking it is not API proposal question. The decision will not require API approval. However, I agree that we should be aligned on how to expose the difference - there will likely be some discussion.
I don't expect that the options differ from each other too much in the implementation -- if community wants to help, getting implementation of any of the options will get us 98% there (assuming the exception is thrown from one place and can be easily changed later to adapt to [1]-[3] options listed above).

Hi! We hit this issue recently as well. Just curious if it's still on your radar for 3.0 ?

Yes, it is still on our 3.0 list - I would still give it 90% chance it will get addressed in 3.0.

I ran across this today. I set a timeout on HttpClient and it gives me this

{System.Threading.Tasks.TaskCanceledException: A task was canceled.}
CancellationRequested = true

I agree with others this is confusing as it doesnt mention timeout being the issue.

Umm, ok I'm slightly confused. Everyone in this thread seems to be saying that HTTP timeouts throw a TaskCanceledException but that's not what I'm getting...I'm getting an OperationCanceledException in .NET Core 2.1...

If you want to throw an OperationCanceledException for timeouts then fine I can work around that, but I just spent hours trying to trace down what the heck was going on because it's not documented properly:

https://docs.microsoft.com/en-us/dotnet/api/system.net.http.httpclient.sendasync

It clearly states that an HttpRequestException is thrown for timeouts...

TaskCanceledException derives from OperationCanceledException.

@stephentoub Yes, but I'm not getting a TaskCanceledException, I'm getting an OperationCanceledException.

As in, a catch block that catches TaskCanceledException does not catch the exception, it specifically needs a catch block for an OperationCanceledException.

I'm now specifically testing for HTTP timeouts by filtering on the type of exception, and the way I know it's a timeout is if it is NOT a TaskCanceledException:

```c#
try {
using (var response = await s_httpClient.SendAsync(request, _updateCancellationSource.Token).ConfigureAwait(false))
return (int)response.StatusCode < 400;
}
catch (OperationCanceledException ex) when (ex.GetType() != typeof(TaskCanceledException)) {
// HTTP timeout
return false;
}
catch (HttpRequestException) {
return false;
}

or alternatively this works as well, which might be better:

```c#
            try {
                using (var response = await s_httpClient.SendAsync(request, _updateCancellationSource.Token).ConfigureAwait(false))
                    return (int)response.StatusCode < 400;
            }
            catch (OperationCanceledException ex) when (ex.GetType() == typeof(OperationCanceledException)) {
                // HTTP timeout
                return false;
            }
            catch (HttpRequestException) {
                return false;
            }

Umm, ok I'm slightly confused. Everyone in this thread seems to be saying that HTTP timeouts throw a TaskCanceledException but that's not what I'm getting...I'm getting an OperationCanceledException in .NET Core 2.1...

If you want to throw an OperationCanceledException for timeouts then fine I can work around that, but I just spent hours trying to trace down what the heck was going on because it's not documented properly:

https://docs.microsoft.com/en-us/dotnet/api/system.net.http.httpclient.sendasync

It clearly states that an HttpRequestException is thrown for timeouts...

Did you read through the entire thread because there's actually several responses where OperationCanceledExecption is referenced, including the code example I posed that shows a way (not as clean as I'd like but it's a way) to handle timeouts explicitly.

@Hobray Yes I have - what makes you think I didn’t? That still doesn’t explain how many people in this thread are supposedly getting TaskCanceledException on timeouts but I’m not.

I saw your code example as well but unfortunately it has a race condition in it that I don’t believe you considered. The code is posted that I’m using avoids this problem and doesn’t require access to the cancellation token, which can be important if you are trying to catch the timeout further upstream.

@mikernet Maybe the wording threw me off. You're right that I hadn't considered a possible race condition. I realize that it's not an awesome solution but if you'd like to explain what I overlooked, I'd like to hear it.

One thing I've noticed is that the exact exception you get does seem to vary depending on whether or not the timeout or cancellation happens in a Controller vs middleware invocation. Inside of a middleware invocation it seems to just be TaskCanceledException. Unfortunately I don't have the time to dig through the .NET Core code libraries to understand why but it's something I've noticed.

@Hobray Haha yeah of course - sorry, I didn't mean to be cryptic about the race condition, I was just on my phone while I was on the go typing that last message which made it difficult to explain.

The race condition in your solution happens if a task cancellation is initiated between the HttpClient timeout and the point where you check the cancellation token cancellation state. In my situation I need a task cancellation exception to keep bubbling up so that upstream code can detect the user cancellation thus I need to be able to catch the exception only if it's actually timeout - that's why I'm using when to filter the catch. Using your solution of checking the token can result in an unhandled OperationCanceledException passing the filter and bubbling up and crashing the program, because someone cancelling the operation upstream will be expecting a TaskCanceledException to be thrown, not an OperationCanceledException.

If the code calling the HttpClient method is supposed to handle both the timeout and the task cancellation then the distinction between timeout and user cancellation when the race condition happens may not be important to you, depending on the situation...but if you're writing library level code and need to let the task cancellation bubble up to be handled at a higher level then it definitely makes a difference.

If what you're saying is true and timeouts throw different exception types depending on what the calling context is then that definitely seems like a problem to me that needs to be addressed. There's no way that is or should be the intended behavior.

Interesting on the possible race condition. I hadn’t considered that. Luckily, for my use case it wouldn’t be much of an issue.

As far as exception variations, I need to find some time to dumb down the middleware I’ve been working on to see if it’s consistent with what I’ve seen a few times while debugging. The stupid Chrome omnibox preload calls has been where I’ve seen it in my middleware when the timing has been just right. I think I can create a simple example to definitively proove it one way or the other.

I am still unclear as to what the correct workaround should be.

My solution is the most correct in situations where http timeouts throw OperationCanceledException. I can’t reproduce the situation where TaskCanceledException is thrown and nobody has provided a way to reproduce that behaviour, so I say test your environment and if it matches mine then that’s the correct solution.

If in some execution contexts a TaskCanceledException is thrown then my solution fails as a general purpose solution for those contexts and the other solution proposed is the “best” solution but contains a race condition that may or may not be important to your situation, that’s for you to evaluate.

I would love to hear more details from someone who is claiming they are getting TaskCanceledExceptions though because that would clearly be a huge bug. I’m somewhat skeptical that this is actually happening.

@mikernet whether TaskCanceledException or OperationCanceledException is thrown is largely irrelevant. That's an implementation detail you shouldn't be relying upon. What you can do is catch OperationException (which will also catch TaskCanceledException), and check whether the cancellation token you passed is canceled, as described in Hobray's comment (https://github.com/dotnet/corefx/issues/20296#issuecomment-449510983).

@thomaslevesque I've already addressed why that approach is a problem. Without addressing that, your comment isn't very useful - it's merely rehashing statements before I joined the conversion without addressing the problem I described.

As for "that's an implementation detail" - yes it is, but unfortunately I have to rely on it because otherwise there isn't a good way to determine whether the task was actually canceled or timed out. There 100% should be a way to determine from the exception which of those two scenarios occurred as to avoid the race condition described. OperationCanceledException is just a poor choice because of this...there is already a TimeoutException in the framework that makes for a better candidate.

Exceptions for asynchronous tasks shouldn't rely on checking data in another object that can also be modified asynchronously to determine the cause of the exception if a common use case is to make a branching decision based on the exception's cause. That's begging for race condition problems.

Also, in terms of the exception type being an implementation detail...I don't really think that's true for exceptions, at least not in terms of well established .NET documentation practices. Name one other example in the .NET Framework where the documentation says "throws exception X when Y happens" and the framework actually throws a publicly accessible subclass of that exception, let alone a publicly accessible subclass that is used for a completely different exception case on the same method! That just seems like bad form and I have never run into this before in my 15 years of .NET programming.

Regardless, TaskCanceledException should be reserved for user initiated task cancelation, not for timeouts - like I said, we already have an exception for that. Libraries should be capturing any internal task cancellations and throwing more appropriate exceptions. Exception handlers for task cancellations that were initiated with task cancellation tokens shouldn't be responsible for dealing with other unrelated problems that might happen during the async operation.

@mikernet sorry, I missed that part of the discussion. Yes, there's a race condition I hadn't thought about. But it might not be as big of a problem as it seems... The race condition occurs if the cancellation token is signaled after the timeout occurs but before you check for the token's state, right? In this case, it doesn't really matter if a timeout occurred if the operation is being canceled anyway. So you'll let a OperationCanceledException bubble up, letting the caller believe that the operation was successfully canceled, which is a lie since actually a timeout occurred, but it doesn't matter to the caller, since they wanted to abort the operation anyway.

There 100% should be a way to determine from the exception which of those two scenarios occurred as to avoid the race condition described. OperationCanceledException is just a poor choice because of this...there is already a TimeoutException in the framework that makes for a better candidate.

I completely agree with that, the current behavior is clearly wrong. In my opinion the best option to fix it is option 1 from @karelz's comment. Yes, it's a breaking change, but I suspect most code bases are not handling this scenario correctly anyway, so it won't make things worse.

@thomaslevesque Well yes, but I kind of addressed that. It may or may not be important for some applications. In mine it is though. The user of my library method isn't expecting OperationCanceledException when they cancel a task - they are expecting TaskCanceledException and they should be able to catch that and be "safe". If I check the cancellation token, see that it was cancelled and pass on the exception then I have just passed on an exception that won't be caught by their handler and will crash the program. My method isn't supposed to throw on timeouts.

Sure, there are ways around this...I can just throw a new wrapped TaskCanceledException but then I lose the top level stack trace, and HTTP timeouts are treated much differently than task cancellations in my service - HTTP timeouts go into a cache for a long time, whereas task cancellations will not. I don't want to miss caching actual timeouts due to the race condition.

I guess I could catch OperationCanceledException and check the cancellation token - if it's been canceled then also check the type of exception to make sure it's a TaskCanceledException before rethrowing to avoid the whole program crashing thing. If any of those conditions is false then it must be a timeout. Only problem left in that case is the race condition which is subtle, but when you have thousands of HTTP requests happening per minute on a heavily loaded service with tight timeouts set it does happen.

Either way that's hacky and as you've indicated should be resolved somehow.

(The race condition will only happen in environments where TaskCanceledException is thrown instead of OperationCanceledException, wherever that may be, but at least the solution is safe from bubbling up an unexpected exception type and doesn't rely on implementation detail in order to still work reasonably well if the implementation detail changes).

The timeout exception thrown by HttpClient in case of http timeout shouldn't be of type System.Net.HttpRequestException (or a descendant?) instead of generic System.TimeoutException? For example, the good old WebClient throws WebException with a Status property which can be set to WebExceptionStatus.Timeout.

Sadly, we won't be able to finish this for 3.0. Moving to Future.
It is still on the top of our backlog, but realistically we won't be able to do it for 3.0 :(.

@karelz oh no :( I was just going to repost saying how I ran into this again at the second company in a row. How long would the window be to submit a PR for 3.0 if I wanted to tackle it myself? Cheers

I'm having a hard time tracking exactly what's going on in the bunch of comments above with people reporting differences between getting TaskCancelledException and OperationCancelledException depending on the context, can anyone from MSFT illuminate that?

Like this comment from @mikernet :

As in, a catch block that catches TaskCanceledException does not catch the exception, it specifically needs a catch block for an OperationCanceledException.

cc @davidsh @stephentoub maybe?

TaskCanceledException derives from OperationCanceledException, and carries a bit of additional information. Just catch OperationCanceledException.

@karelz would option 1 you posted originally, just making it throw a Timeout exception, be completely unpalatable?

Things like StackExchange.Redis throw a exception derived from Timeout exception, that's why it would be nice to just catch Timeout exception for HttpClient...

I understand @stephentoub given the inheritance, just this comment below from @mikernet is blowing my mind... This behavior he describes would have to be some kind of bug?? But if true, would influence the fix or workaround…

"a catch block that catches TaskCanceledException does not catch the exception, it specifically needs a catch block for an OperationCanceledException"

Or are you saying that some places it throws an OperationEx and other places a TaskEx ? Like maybe people are seeing something coming from ASP middleware and thinking it's from HttpClient?

just this comment below from @mikernet is blowing my mind... This behavior he describes would have to be some kind of bug??

Why is that "blowing your mind"? If exception B derives from exception A, a catch block for B will not catch an exception that is concretely A, e.g. if I have a catch block for ArgumentNullException, it will not catch " throw new ArgumentException(...)".

Or are you saying that some places it throws an OperationEx and other places a TaskEx ?

Right. In general code should assume only OCE; the way things are written, some code may end up throwing the derived TCE, e.g. a task completed via TaskCompletionSource.TrySetCanceled will produce a TaskCanceledException.

Thanks Stephen, from the sounds of things in the thread multiple people were under the impression that TaskCE was being thrown everywhere, which would have made the behavior mentioned by mikernet make no sense... Now that you've clarified , it makes makes his observation logical :)

@stephentoub, is this guidance documented anywhere? Pinging @guardrex

In general code should assume only OCE...

I work the ASP.NET doc set these days. Open an issue for the dotnet/docs cats ...

https://github.com/dotnet/docs/issues

Is it possible to restart this conversation back where there were some concrete proposals with @karelz and @davidsh ? It just feels like someone(s) that have ownership of the HttpClient API model need to make decisions about the direction to go. There was some discussion about top-level exceptions always being HttpRequestException (other than explicit cancellations??), mixed in with some longer-term seeming topic about a larger refactor to modify HttpRequestion to include a new enum property similar to WebExceptionStatus...

It just feels like this whole issue is getting squirreled relative to the actual code change needed. ARe there other people who have a stake/ownership in the HttpClient API model that should be looped in in order to make some decisions? Thanks a ton all!

I'm having the same issue here -- however, cancellationToken.IsCancellationRequested is returning true:

    public class TimeoutHandler : DelegatingHandler
    {
        protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
        {
            try
            {
                var response = await base.SendAsync(request, cancellationToken);
                response.EnsureSuccessStatusCode();

                return response;
            }
            catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
            {
                throw new TimeoutException("Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.");
            }
        }

I'm forcing a timeout. The exception raised is TaskCancelledException with IOException as inner exception:

System.Threading.Tasks.TaskCanceledException: The operation was canceled. ---> System.IO.IOException: Unable to read data from the transport connection: A operação de E/S foi anulada devido a uma saída de thread ou a uma requisição de aplicativo. ---> System.Net.Sockets.SocketException: A operação de E/S foi anulada devido a uma saída de thread ou a uma requisição de aplicativo
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.FillAsync()
   at System.Net.Http.HttpConnection.ReadNextResponseHeaderLineAsync(Boolean foldedHeadersAllowed)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithNtConnectionAuthAsync(HttpConnection connection, HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)

...

(Sorry for the translated exception text -- apparently someone thought it would be a good idea to bring this to .NET Core as well)

As you can see, I was hoping to check IsCancellationRequest to workaround this issue as other have suggested, but even on timeouts it's returning true.

Currently I'm thinking of implementing a timer to check if the configured timeout has passed. But, obviously i wanna die this is a horrible workaround.

Hi @FelipeTaveira,

Your approach doesn't work, because the cancellation is handled upstream of your TimeoutHandler, by the HttpClient itself. Your handler receives a cancellation token that is cancelled by the HttpClient when a Timeout occurs. So checking that the cancellation token is not cancelled only works in code that is calling HttpClient, not in code called by HttpClient.

The way to make it work with a custom handler is to disable the HttpClient's Timeout by setting it to Timeout.InfiniteTimeSpan, and controlling the timeout behavior in your own handler, as shown in this article (full code here)

@thomaslevesque Oh, thanks! I had read through your code and thought that part was about supporting per-request timeout config and hadn't noticed that it was also necessary to support the TimeoutException thing.

I hope this is addressed in the future.

For what it is worth, I also find TaskCanceledException being thrown on timeouts to be confusing.

I can attest that TaskCanceledExceptions were a major source of confusion when dealing with production issues in my previous job. We'd generally wrap calls to HttpClient and throw TimeoutExceptions where appropriate.

@eiriktsarpalis thanks. We plan to address it in 5.0, so you can pick it up in that time frame if you want, now that you joined our team 😇.

1 year? 😵 Where is that promised agility of .NET Core?

Judging agility of entire project based on one issue is a bit ... unfair.
Yes, it is a bit embarrassing, we still have this problem, however, it is hard to fix (with compat impact) and there were just higher-pri business goals to tackle first (in networking) - see the changes in milestones throughout history of this issue.

BTW: Like in many other OSS projects, there is no SLA or ETAs or promises on when issue will be fixed. All depends on priorities, other work, difficulty and number of experts being able to tackle the problem.

Judging agility of entire project based on one issue is a bit ... unfair.

Well, it was an honest question. It's been already two years and when I saw that it is going to take another one I was a bit surprised. No offense meant though 🙂. I know that in order to fix this issue there were/are a few questions that had/have to be answered to first.

In a world of compromises... I don't like it but here's another option to chew on:

We could continue throwing TaskCanceledException, but set its CancellationToken to a sentry value.

c# catch(TaskCanceledException ex) when(ex.CancellationToken == HttpClient.TimeoutToken) { }

And another: throw a new HttpTimeoutException which derives from TaskCanceledException.

And another: throw a new HttpTimeoutException which derives from TaskCanceledException.

I think that would address all concerns. It makes it easy to specifically catch a timeout, and it's still a TaskCanceledException, so code that currently catches that exception will also catch the new one (no breaking change).

And another: throw a new HttpTimeoutException which derives from TaskCanceledException.

It seems rather ugly, but probably a good compromise.

And another: throw a new HttpTimeoutException which derives from TaskCanceledException.

I think that would address all concerns. It makes it easy to specifically catch a timeout, and it's still a TaskCanceledException, so code that currently catches that exception will also catch the new one (no breaking change).

Well, there _could_ be people out there writing code like this:

try
{
  // ...
}
catch (Exception ex) when (ex.GetType() == typeof(TaskCanceledException))
{
}

So this would be technically a breaking change.

Also I think it's got a bit of the worst of both worlds. We still won't get a general TimeoutException at the same time having a breaking change.

there could be people out there writing code like this so this would be technically a breaking change

FWIW, pretty much every change is possibly breaking, so to make forward progress on anything one has to draw a line somewhere. In corefx we don't consider returning or throwing a more derived type to be breaking.
https://github.com/dotnet/corefx/blob/master/Documentation/coding-guidelines/breaking-change-rules.md#exceptions

In a world of compromises... I don't like it but here's another option to chew on:

We could continue throwing TaskCanceledException, but set its CancellationToken to a sentry value.

catch(TaskCanceledException ex) when(ex.CancellationToken == HttpClient.TimeoutToken)
{
}

I really like this one. Is there any other disadvantage to it besides being ""ugly""? What is ex.CancellationToken today when a timeout happens?

Is there any other disadvantage to it besides being ""ugly""

It would be really easy for someone to think that TimeoutToken can be passed around and used to signal cancellation when a timeout occurs, since this is the design that's used elsewhere for such things.

I really like this one. Is there any other disadvantage to it besides being ""ugly""? What is ex.CancellationToken today when a timeout happens?

It's much less intuitive than catching a specific exception type, which can be explicitly mentioned in documentation.

I am used to consider a OperationCanceledException and derived types to signify an interrupted execution, not an application error. The exception HttpClient throws on timeout is usually an error - there was a call to a HTTP API, and it was unavailable. This behavior either leads to complications while processing exceptions on a higher level, for example, in ASP.NET Core pipeline middleware, since this code has to know that there are two flavors of OperationCanceledException, error and not-error, or it forces you to wrap every HttpClient call in a throw-catch block.

And another: throw a new HttpTimeoutException which derives from TaskCanceledException.

It seems rather ugly, but probably a good compromise.

Is it really that ugly though, given the design constraints involved and backwards-compatibility considerations? I don't know that anyone in this thread has proposed an obviously superior solution that has zero drawbacks (whether usage or purity of design etc)

We want to look at this deeper than just changing the exception type. There is also a long-standing issue of "where did we timeout?" -- for instance, it is possible for a request to timeout without ever having hit the wire -- that we need to look at, and a solution should be compatible with the direction we go there.

We want to look at this deeper than just changing the exception type. There is also a long-standing issue of "where did we timeout?" -- for instance, it is possible for a request to timeout without ever having hit the wire -- that we need to look at, and a solution should be compatible with the direction we go there.

Is this really something you need to know? I mean, will you handle the error differently depending on where it timed out? For me, just knowing that a timeout occurred is usually enough.
I don't mind if you want to do more, I just don't want this change to miss the .NET 5 train because of additional requirements :wink:

I think the issue @scalablecory raised is noteworthy, but I agree that it shouldn't hinder this getting fixed sooner. You can decide on a mechanismism to allow the timeout reason to be evaluated but please dont delay fixing the exception issue for that...you can always add a Reason property to the exception later.

I agree with @thomaslevesque; I've not heard that as a big problem before. Where is that coming from?

Is that a fairly niche debugging case where it would be OK to have digging into the reason to require a little explicit action, whereas for general developer they’d only have to know/care about handling a single thing.

We want to look at this deeper than just changing the exception type. There is also a long-standing issue of "where did we timeout?" -- for instance, it is possible for a request to timeout without ever having hit the wire -- that we need to look at, and a solution should be compatible with the direction we go there.

I presume the implication here is that it should be the responsibility of each HttpMessageHandler to throw the right timeout exception type rather than HttpClient fixing things up.

I presume the implication here is that it should be the responsibility of each HttpMessageHandler to throw the right timeout exception type rather than HttpClient fixing things up.

I don't see how that's possible with the abstraction defined today. HttpMessageHandler has no knowledge of HttpClient's Timeout; as far as an HttpMessageHandler is concerned, it's just handed a cancellation token, which could have cancellation requested for any number of reasons, including the caller's cancellation token being signaled, the HttpClient's CancelPendingRequests getting called, the HttpClient's enforced timeout expiring, etc.

Is this really something you need to know? I mean, will you handle the error differently depending on where it timed out? For me, just knowing that a timeout occurred is usually enough.

It's useful to know if the server may have processed your request. It's also great for diagnostics -- a remote server not timing out but the client reporting timeouts is a frustrating experience.

@davidsh may have some more insight.

I agree with @thomaslevesque; I've not heard that as a big problem before. Where is that coming from?

It was a point raised during a recent NCL meeting. It may be we decide this is not important enough to expose or let delay things, but it's worth a quick discussion first.

@dotnet/ncl @stephentoub Lets put this one to bed. We know that anything we do will be subtly breaking. This seems like the least worst option. I propose moving forward with:

throw a new HttpTimeoutException which derives from TaskCanceledException.

Objections?

throw a new HttpTimeoutException which derives from TaskCanceledException.

How do we do this in practice? It probably means we have to scrub our code and find all places that call CancellationToken.ThrowIfCancellationRequested and instead convert it to:

c# if (token.IsCancellationRequested) throw new HttpTimeoutException(EXTRASTUFF, token);

as well as an explicit places we might actually be calling OperationCancelledException as well.

Is there any option that doesn't require going over all that related code anyway?

Is there any option that doesn't require going over all that related code anyway?

Probably not but there might be places we don't control (not really sure yet) and so we'll need to potentially grab a random TaskCancelledException/OperationCancelledException and rethrow the new HttpTimeoutException.

It probably means we have to scrub our code and find all places that call CancellationToken.ThrowIfCancellationRequested and instead convert it to:

Is there any option that doesn't require going over all that related code anyway?

Not exactly.

As far as handlers are concerned, all they receive is a CancellationToken: they don't know and don't care where that token came from, who produced it, or why it might have cancellation requested. All they know is that cancellation at some point might be requested, and they throw a cancellation exception in response. So, there's nothing to do here in the handlers.

It's HttpClient itself that is creating the relevant CancellationTokenSource and setting a timeout on it:
https://github.com/dotnet/corefx/blob/bdd33976fe3713cc3e78b83cf2a1176c70b793be/src/System.Net.Http/src/System/Net/Http/HttpClient.cs#L488
The timeout is entirely the invention of HttpClient itself, and only it knows whether it would want to treat a cancellation exception that emerged as being special.

That means the above referenced code would need to be refactored to track the timeout separately, either as a separate CancellationTokenSource, or more likely by handling the timeout logic on its own, and in addition to canceling the CTS, also setting a flag that it can then also check. This is likely going to be non-pay-for-play, in that we'll be adding additional allocations here, but from a code perspective it's relatively straightforward.

Then in places in HttpClient like:
https://github.com/dotnet/corefx/blob/bdd33976fe3713cc3e78b83cf2a1176c70b793be/src/System.Net.Http/src/System/Net/Http/HttpClient.cs#L536-L541
and
https://github.com/dotnet/corefx/blob/bdd33976fe3713cc3e78b83cf2a1176c70b793be/src/System.Net.Http/src/System/Net/Http/HttpClient.cs#L562-L566
when it catches an exception, it would need to special-case cancellation, to check to see whether the timeout expired, and if it had, assume it was the cause of the cancellation and throw a new exception of the desired type.

it would need to special-case cancellation, to check to see whether the timeout expired, and if it had, assume it was the cause of the cancellation and throw a new exception of the desired type.

How does this work if HttpClient.Timeout property is not involved? Perhaps it is set to "infinite". What about cancellation tokens passed directly into the HttpClient APIs (GetAsync, SendAync etc.)? I assume that would work the same?

How does this work if HttpClient.Timeout property is not involved?

I don't understand the question... that's the only timeout here we're talking about. If it's set to Infinite, then cancellation would never be caused by this timeout, and there'd be nothing to do. We already special-case Timeout == Infinite and would continue to:
https://github.com/dotnet/corefx/blob/bdd33976fe3713cc3e78b83cf2a1176c70b793be/src/System.Net.Http/src/System/Net/Http/HttpClient.cs#L481

What about cancellation tokens passed directly into the HttpClient APIs (GetAsync, SendAync etc.)? I assume that would work the same?

We combine whatever token is passed in with one of our own making:
https://github.com/dotnet/corefx/blob/bdd33976fe3713cc3e78b83cf2a1176c70b793be/src/System.Net.Http/src/System/Net/Http/HttpClient.cs#L485
We would track our own handling of HttpClient.Timeout, as mentioned in my previous comments. If the caller chooses to pass in a token that they've configured with a timeout, then treating that specially is up to them: as far as we're concerned, it's just a cancellation request.

Objections?

Why introduce a new derived exception type that "competes" with the existing TimeoutException rather than, say, throwing a cancellation exception that wraps a TimeoutException as the inner exception? If the goal is to enable a consumer to programmatically determine whether the cancellation was due to a timeout, would that suffice? It would seem to communicate the same information, without needing new surface area.

Why introduce a new derived exception type that "competes" with the existing TimeoutException rather than, say, throwing a cancellation exception that wraps a TimeoutException as the inner exception

For convenience, mostly. It's easier and more intuitive to write

catch(HttpTimeoutException)

than

catch(OperationCanceledException ex) when (ex.InnerException is TimeoutException)

That being said, either way is fine by me, as long as it's reasonably easy to identify a timeout.

say, throwing a cancellation exception that wraps a TimeoutException as the inner exception?

I think that's a good compromise. It would help disambiguate exception logs, which was my primary source of annoyance when dealing with these kinds of timeouts. It probably also better communicates the fact that this is the HttpClient cancelling a request, rather than a timeout being raised by the underlying handler. Also, no breaking changes.

Triage:

  • We will update this scenario to have an InnerException of TimeoutException with a message for logging/diagnostics that easily distinguishes the cancellation as being caused by the timeout triggering.
  • We will update documentation with guidance on ways one might programatically inspect to differentiate cancellation: checking for TimeoutException, using an infinite Timeout and passing a CancellationToken with a timeout period, etc.

We recognize that any solve here involves some compromise, and believe this will provide a good level of compatibility with existing code.

Why introduce a new derived exception type that "competes" with the existing TimeoutException rather than, say, throwing a cancellation exception that wraps a TimeoutException as the inner exception

For convenience, mostly. It's easier and more intuitive to write

catch(HttpTimeoutException)

than

catch(OperationCanceledException ex) when (ex.InnerException is TimeoutException)

That being said, either way is fine by me, as long as it's reasonably easy to identify a timeout.

@scalablecory I'm bummed that I have appeared to miss the discussion by a few hours - my experience communicating this type of things with general LoB .NET developers is that catching a derived HttpTimeoutException is way easier for people to understand/adopt than having to remember to use an exception filter on an inner exception. Not everyone even knows what an exception filter is.

At the same time, I totally recognize that this is one of those scenarios where there's no one clear winning/no-compromise solutions, like you said.

We agreed on that point @ericsampson but that would be breaking change for many existing users. As @scalablecory mentioned we compromised to make improvement while limiting compatibility damage.

When I think about the exceptions, in general, I consider the wrapping approach better. Deriving exception from OperationCanceledException makes such exception available only in asynchronous usage. And even it is tight to tasks. As we saw in the past there were several evolutions of async approaches and I consider task as an implementation detail however TimeoutException, for example, is part of socket concept which will not change.

We agreed on that point @ericsampson but that would be breaking change for many existing users. As @scalablecory mentioned we compromised to make improvement while limiting compatibility damage.

Although I don't necessarily disagree with the wrapping option, I think it's worth pointing out that - as far as I've understood - the issue is not that the derived option is a breaking change, but that the wrapping one is simpler. As @stephentoub mentioned earlier in the discussion, throwing a derived exception is not considered a breaking change by the corefx guidelines.

Deriving HttpTimeoutException from TaskCanceledException violates “is a” relationship between these two exceptions. It will still make calling code to think that cancellation happened, unless it specifically handles HttpTimeoutException. This is basically the same handling that we have to do at the moment when we check for cancellation token being canceled to determine whether it’s a timeout or not. Besides, having two timeout exceptions in core library is confusing.
Adding TimeoutException as an inner exception to TaskCanceledException will not provide any additional information in the most cases, as we can already check cancellation token and determine whether it’s timeout or not.

I think the only sensible solution is to change exception thrown on timeout to TimeoutException. Yes, it will be a breaking change, but the issue already spans several major releases and breaking changes is what major releases are for.

Can the documentation for existing versions be updated to indicate that methods could throw TaskCanceledException and/or OperationCanceledException (both concrete types appear to be a possibility) when a timeout occurs? Currently the documentation indicates HttpRequestException deals with timeouts, which is incorrect and misleading.

@qidydl that's the plan... will fully document any ways timeouts can manifest and how to detail with them.

Thanks @alnikola and @scalablecory and @davidsh and @karelz and everyone else involved in this discussion/implementation, I really appreciate it.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

btecu picture btecu  ·  3Comments

matty-hall picture matty-hall  ·  3Comments

omariom picture omariom  ·  3Comments

nalywa picture nalywa  ·  3Comments

sahithreddyk picture sahithreddyk  ·  3Comments