Httpx: Passing 'timeout=None' should always result in no timeout

Created on 3 Oct 2019  Â·  24Comments  Â·  Source: encode/httpx

Spun up from https://github.com/encode/httpx/issues/411

Currently, the behavior of passing None to the timeout parameter in the various request entrypoints is not consistent:

import httpx

url = "https://httpbin.org/delay/10"

# (1)
httpx.get(url, timeout=None)  # Times out after 5s

# (2)
client = httpx.Client(timeout=None)
client.get(url)  # Does not timeout, returns after 10s

# (3)
client = httpx.Client()
client.get(url, timeout=None)  # Times out after 5s

We should make sure passing None always results in the behavior (2), i.e. no timeout should be applied, as opposed to (1) and (3) where the default timeout configuration is used.


The implementation suggested in #411 is to use a sentinel object:

# models.py
UNSET = object()

# api.py
def get(..., timeout: TimeoutTypes = UNSET, ...):
    ...

We need to switch to using UNSET in all relevant places. For example (though these may not end up being relevant, as I have not solved this myself):

  • Functions in httpx/api.py.
  • Methods of BaseClient, Client and AsyncClient in httpx/client.py.
  • Methods of dispatchers in httpx/dispatch/, esp. .send().
  • Possibly other places that aren't on the top of my head at the time of writing this. :)

Next, we need to check for that sentinel instead of None when deciding whether to use the default timeout config.

Notable places affected by this switch are:

  • .send() methods in clients and dispatchers (only use the client's or dispatcher's self.timeout if timeout is UNSET, instead of None).
  • The .connect(), .read() and .write() methods on the concurrency backend classes:

https://github.com/encode/httpx/blob/86a0eb0268bdc4beeb90d8136546233cd8401b0e/httpx/concurrency/asyncio.py#L63-L67

user-experience

All 24 comments

I'd like to work on this.

@florimondmanca I don't understand the reason to use the sentinel value. Wouldn't be the same to keep DEFAULT_TIMEOUT_CONFIG as the default timeout value for httpx.Client() and all the top-level methods like httpx.get()? That way we would avoid having to check if timeout is UNSET.

@jcugat Good question, thanks, As it turns out, we already default DEFAULT_TIMEOUT_CONFIG in the constructors for Client, AsyncClient, ConnectionPool, HTTPConnection, etc. I'll edit the issue description based on this.

But there's a few design decisions about clients and dispatchers (which all expose a .send() method) that hint at using a sentinel value:

  • We should be able to instanciate a dispatcher independently, e.g. for testing or narrowed-down usage. This is why all these components already use DEFAULT_TIMEOUT_CONFIG as a default in their constructors.
  • We should be able to override the timeout configuration at the time of calling .send(). To do this, clients and dispatchers need to distinguish between "the user did not pass a value" and "the user explicitly doesn't want any timeout".

Right now .send() cannot make that distinction, but it can be achieved if we switch to UNSET as a default for timeout in .send() and its sub-methods (e.g. .connect() on HTTPConnection).

This issue is more subtle than I initially thought it would be and will require to go through various places of the code base. Still a nice first issue, but more challenging than usual, so I removed the label. :)

@florimondmanaca can you assign it to me please?

I don't understand the reason to use the sentinel value.

The only place we need to use an UNSET value is on the client .request(...) methods.
That's because we need to differentiate between two difference cases:

  • timeout is not set (Defer to whatever is set on the Client instance)
  • timeout=None (Don't timeout, regardless of whatever is set on the Client instance)

We don't need it on the client value, and we don't need it on the top-level API methods.

The only place we need to use an UNSET value is on the client .request(...) methods.
We don't need it on the client value, and we don't need it on the top-level API methods.

Except the top-level API methods default to None and pass that value along, making the top-level API default to no timeout. I don't think that's desired. And that doesn't help the fact that the dispatcher cannot currently distinguish between "the user did not pass a value" and "the user explicitly doesn't want any timeout", as Florimond said earlier. Doing this would leave the behavior inconsistent across the library just as now.

Or am I just getting my mind in a twist this morning? Or is this with the decision that the defaults in those places would be DEFAULT_TIMEOUT_CONFIG?

Ok, so if I got it right, in summary we want:

  • [x] Client instance: leave default DEFAULT_TIMEOUT_CONFIG
  • [ ] client .request(...) methods: change it to have the UNSET sentinel
  • [ ] Top-level API methods in httpx/api.py: change it to have default DEFAULT_TIMEOUT_CONFIG
  • [ ] .send() methods in clients and dispatchers: change it to have the UNSET sentinel

Is this correct?

@thebigmunch Ah yeah, not sure what I was thinking there. Easiest, most consistent then would be for the top-level ones to use the UNSET sentinal too, then. (Having them be DEFAULT_TIMEOUT_CONFIG would also work just fine, but I guess we may as well make them mirror their equivelent methods)

So, @jcugat I'd probably suggest the last three in that list should all be "use UNSET".

I think we've got that right, then?

What about the default in AsyncDispatcher.request and Dispatcher.request? What about the default here? I may be missing some spots; I just took a quick skim, so I don't have a good idea about their use and interaction with everything.

@tomchristie Considering #463 turned out to make the UNSET sentinel leak across the code base, I opened a bunch of PRs the to explore the option of using timeout=False to disable timeouts. They result in a rigorous consistency of what timeout=None/timeout=False means across the code base and for our users, which I find quite pleasing. They need to be reviewed/merged in order:

  • #491: Reorganization of timeout config tests -- preliminary cleanup.
  • #492: Always treat timeout=None as "use defaults".
  • #493: Always treat timeout=False as "don't timeout".

Anyone interested is much welcome to take a look at and drop comments on those PRs as well. :)

@florimondmanca

  • Having one boolean value be meaningful argument while the other is not is almost always a bad idea and will likely cause confusion. I use meaningful instead of valid here because: 1 == True. So, True is technically a valid input, just not in a way people would likely expect.
  • As I said before, timeout=False does not really read as what you'd expect for "no timeout" when compared to timeout=None.
  • While I don't necessarily feel httpx has to be beholden to requests' API or any other library, it should be considered that httpx would be changing the meaning of timeout=None as compared to requests and, IIRC, aiohttp. Can't say I've had enough experience with other http client libraries to determine if that's true of them or not.

I certainly didn't really like the way UNSET looked going throughout the library either. My only semi-reasonable suggestion for a builtin constant that would mean don't timeout would be ellipsis. The problem there is that most people don't actually know that exists. And any prior http client library usage would probably lead them to assume they should use None.

Another option could be to just set the default parameter values to 5, leaving timeout=None to mean no timeouts.

@thebigmunch Thanks for your comments.

Your first point makes me think — what if we switched from Noneto True as the default, and disallowed None altogether? Then, we'd be able to raise a nice and clean exception to our users:

httpx.get(url, timeout=True)  # Default behavior: apply default timeouts.
httpx.get(url, timeout=False)  # *Don't* apply timeouts.
httpx.get(url, timeout=None)  # ERROR: Please use 'timeout=False' to disable timeouts

I mean, if we decided to break backwards compatibility w.r.t. to timeouts (Requests applies no timeouts by default, but HTTPX does), it'd be nice to make it loud a clear in the code, right?

Another option could be to just set the default parameter values to 5, leaving timeout=None to mean no timeouts.

Applying DEFAULT_TIMEOUT_CONFIG internally whenever relevant still seems like a valid option, yes. @sethmlarson also suggested this in https://github.com/encode/httpx/pull/493#issuecomment-544265312. Upside: we'd be able to keep timeout=None as "no timeout". Downside: bunch of changes required to the code, but I guess we can live up to that. :)

Another option could be to just set the default parameter values to 5, leaving timeout=None to mean no timeouts.

Seems like a very sensible suggestion, right? Not sure why it took us such a winding road to get there. 😌

Seems like a very sensible suggestion, right? Not sure why it took us such a winding road to get there.

Sometimes its difficult to see the wood for the trees ^_^

So, unless I'm really missing something here, I think the change to timeout=5 is not as trivial as we'd like it to be. Consider this example:

import asyncio
import httpx


async def main(url: str) -> None:
    async with httpx.ConnectionPool(timeout=1e-2) as http:
        r = await http.request("GET", url)


asyncio.run(main("http://localhost:8000"))

I guess we're expecting timeout=1e-2 to propagate to the .request() call, right?

Except… .request() has its own timeout parameter, so defaulting it to timeout=5 too means we have no way to tell if a timeout was passed, or if we should fallback to self.timeout.

And this pattern is present in virtually the entire codebase: in .send(), .connect(), .read(), .write(), etc.

I understand the initial goal of having timeout parameters present everywhere is to be able to use small components of HTTPX independently — and I think Starlette has proved that this approach is very beneficial. But it seems to me having a well-defined timeout everywhere and allowing components to be used independently or as part of a wider system is surprisingly not easy to achieve.

Again, am I just missing something really obvious? 😕

I think this wouldn't be a problem if we were using **kwargs. We'd basically be able to use everywhere:

def send(self, **kwargs):
    timeout = kwargs.get("timeout", self.timeout)
    ...

But obviously **kwargs goes against our efforts of providing good editor and type checking support, so, yeah… At this point, I can't find of any other way to check for "this kwarg was not given" other than using a sentinal value. (Or it seems like we'd need to abandon per-call timeout config, e.g. on ConnectionPool and other dispatchers — and then maybe even Client, because how are we then supposed to pass the overidden timeout to the dispatcher? By setting its timeout attribute (provided it has one)? — which is obviously not acceptable.)

If anyone is up for tackling this (and give the problem another POV), I'd be super happy. I feel like I'm going crazy over the issues that arise when actually trying to make this change, and don't have the resources to dig any deeper right now unfortunately.

Again, am I just missing something really obvious?

I definitely mentioned it in regard to Client and its methods in your PR. I hadn't thought past that to see if it cropped up even deeper though.

Or it seems like we'd need to abandon per-call timeout config, e.g. on ConnectionPool and other dispatchers — and then maybe even Client, because how are we then supposed to pass the overidden timeout to the dispatcher?

In the prior art of requests, they quite specifically kept timeout out of Session because they wanted to keep a semantic separation between the session and the adapter layers (client vs dispatcher in httpx).

I think this wouldn't be a problem if we were using **kwargs

Again, probably the solution requests used for the same problem.

But obviously **kwargs goes against our efforts of providing good editor and type checking support, so, yeah…

This would also go against the decision to remove unsupported parameters for some of the methods.

If anyone is up for tackling this (and give the problem another POV), I'd be super happy. I feel like I'm going crazy over the issues that arise when actually trying to make this change, and don't have the resources to dig any deeper right now unfortunately.

Yeah, I hadn't yet dove deep enough to see the full extent of the rabbit hole... until just now. Ouch.

Was this page helpful?
0 / 5 - 0 ratings