I use httpx to request a Kubernetes watch API. The API is based on HTTP chunked transfer encoding: when the server generates a new pod message, it sends it to the client.
Simplified code is as follows:
import httpx
resp = httpx.get('http://127.0.0.1:18081/api/v1/watch/pods', stream=True)
for chunk in resp.stream():
    pass
After 5 seconds, I get a ReadTimeout error:
Traceback (most recent call last):
File "t.py", line 4, in <module>
for line in resp.stream():
File "/usr/local/lib/python3.6/site-packages/httpx/models.py", line 1077, in stream
for chunk in self.raw():
File "/usr/local/lib/python3.6/site-packages/httpx/models.py", line 1105, in raw
for part in self._raw_stream:
File "/usr/local/lib/python3.6/site-packages/httpx/concurrency/base.py", line 163, in iterate
yield self.run(async_iterator.__anext__)
File "/usr/local/lib/python3.6/site-packages/httpx/concurrency/asyncio.py", line 261, in run
return self.loop.run_until_complete(coroutine(*args, **kwargs))
File "/usr/lib64/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/usr/local/lib/python3.6/site-packages/httpx/dispatch/http11.py", line 152, in _receive_response_data
event = await self._receive_event(timeout)
File "/usr/local/lib/python3.6/site-packages/httpx/dispatch/http11.py", line 174, in _receive_event
self.READ_NUM_BYTES, timeout, flag=self.timeout_flag
File "/usr/local/lib/python3.6/site-packages/httpx/concurrency/asyncio.py", line 83, in read
raise ReadTimeout() from None
httpx.exceptions.ReadTimeout
If I change the timeout to 60 seconds, I get a RemoteProtocolError instead:
Traceback (most recent call last):
File "t.py", line 4, in <module>
for chunk in resp.stream():
File "/usr/local/lib/python3.6/site-packages/httpx/models.py", line 1077, in stream
for chunk in self.raw():
File "/usr/local/lib/python3.6/site-packages/httpx/models.py", line 1105, in raw
for part in self._raw_stream:
File "/usr/local/lib/python3.6/site-packages/httpx/concurrency/base.py", line 163, in iterate
yield self.run(async_iterator.__anext__)
File "/usr/local/lib/python3.6/site-packages/httpx/concurrency/asyncio.py", line 261, in run
return self.loop.run_until_complete(coroutine(*args, **kwargs))
File "/usr/lib64/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/usr/local/lib/python3.6/site-packages/httpx/dispatch/http11.py", line 152, in _receive_response_data
event = await self._receive_event(timeout)
File "/usr/local/lib/python3.6/site-packages/httpx/dispatch/http11.py", line 164, in _receive_event
event = self.h11_state.next_event()
File "/usr/local/lib/python3.6/site-packages/h11/_connection.py", line 420, in next_event
event = self._extract_next_receive_event()
File "/usr/local/lib/python3.6/site-packages/h11/_connection.py", line 369, in _extract_next_receive_event
event = self._reader.read_eof()
File "/usr/local/lib/python3.6/site-packages/h11/_readers.py", line 170, in read_eof
"peer closed connection without sending complete message body "
h11._util.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
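For context on the "incomplete chunked read" message: a chunked body is a series of `<hex size>\r\n<data>\r\n` chunks and is only complete once a zero-length chunk arrives. If the server closes the connection before sending that final chunk, the client sees an unfinished body. The function below is a minimal sketch of that rule, not the parser h11 actually uses; `decode_chunked` is a hypothetical name, and it ignores trailers.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode a complete HTTP/1.1 chunked body; raise if it ends early."""
    body = b""
    pos = 0
    while True:
        line_end = raw.find(b"\r\n", pos)
        if line_end == -1:
            # The peer stopped before sending the next chunk-size line.
            raise ValueError("incomplete chunked read: missing chunk-size line")
        # The size is hexadecimal; drop any chunk extensions after ';'.
        size = int(raw[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            # A zero-length chunk marks the end of the body.
            return body
        chunk = raw[pos:pos + size]
        if len(chunk) < size or raw[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("incomplete chunked read: chunk cut short")
        body += chunk
        pos += size + 2
```

Calling `decode_chunked(b"5\r\nhello\r\n")` raises, because the terminating `0\r\n\r\n` never arrived. A watch stream that the server drops after some idle period would look like this to the client.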
This is a difference coming from requests I ran into as well. httpx sets a default timeout, so you'll have to add timeout=None to your request calls or client instance (if used) if you want no timeout.
It would probably be nice to add to the Timeouts docs as well as the requests compatibility docs when they get fleshed out. I have a feeling this one will come up a fair bit.
Thanks for the quick reply.
timeout=None is already the default for httpx.get:
https://github.com/encode/httpx/blob/5ced56b5b55813bc38d571fdba6176842c0ec716/httpx/api.py#L68
And I see that if the timeout is None, TCPStream.read replaces it with DEFAULT_TIMEOUT_CONFIG.
I can't find a way to control this behavior.
Ah, this seems like it might be a bug. Setting timeout=None on a client instance works properly. Setting timeout=None on any of the request calls, regardless of whether it's through the top-level API or through a client method, uses the default timeout.
Edit: That is, if you used:
import httpx
client = httpx.Client(timeout=None)
resp = client.get('http://127.0.0.1:18081/api/v1/watch/pods', stream=True)
for chunk in resp.stream():
    pass
you should get the behavior you want for now.
There are actually three kinds of timeouts you can control in HTTPX: connect, read, and write. Unfortunately, the documentation doesn't show this ability to fine-tune timeouts yet (TimeoutConfig isn't even documented), and I agree it's something we need to address.
For your use case, it seems what you want is to modify the read timeout. (That way, you'll still time out when connecting or when sending data, which is good because those are unexpected situations.)
The following should work:
import httpx
timeout = httpx.TimeoutConfig(connect_timeout=5, read_timeout=None, write_timeout=5)
resp = httpx.get('http://127.0.0.1:18081/api/v1/watch/pods', stream=True, timeout=timeout)
for chunk in resp.stream():
    pass
Unfortunately there's no way to override only one of the timeouts while keeping default values for the others, so you need to explicitly set the connect and write timeouts. I reckon what we'd need is something like:
timeout = httpx.DEFAULT_TIMEOUT_CONFIG.replace(read_timeout=None)
but that doesn't exist yet.
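To illustrate the idea, here is a minimal sketch of how a `replace`-style override could behave, built on the standard library's `dataclasses.replace`. `SketchTimeoutConfig` is a hypothetical stand-in, not the real `httpx.TimeoutConfig`, and the 5-second defaults are an assumption based on the timeout seen earlier in this thread.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchTimeoutConfig:
    """Hypothetical stand-in for httpx.TimeoutConfig (not the real class)."""
    connect_timeout: Optional[float] = 5.0  # assumed default, in seconds
    read_timeout: Optional[float] = 5.0     # assumed default, in seconds
    write_timeout: Optional[float] = 5.0    # assumed default, in seconds


DEFAULTS = SketchTimeoutConfig()

# Override only the read timeout; connect and write keep their defaults.
watch_timeout = replace(DEFAULTS, read_timeout=None)
```

Because the dataclass is frozen, `DEFAULTS` itself is never mutated; `replace` returns a new object with just the named field changed.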
Also — according to this SO thread you should expect the connection to drop anyway, so a long timeout along with an infinite loop that reconnects when losing the connection to the watch endpoint might be preferable:
import httpx

def get_pod_events():
    timeout = httpx.TimeoutConfig(connect_timeout=5, read_timeout=5 * 60, write_timeout=5)
    while True:
        # Open a new watch request each time the previous stream ends.
        resp = httpx.get('http://127.0.0.1:18081/api/v1/watch/pods', stream=True, timeout=timeout)
        for event in resp.stream():
            yield event
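Note that a read timeout or a dropped connection raises inside the `for` loop, so the loop above would stop rather than reconnect in those cases. The following is a minimal sketch of a reconnect loop that also retries on errors. It is library-agnostic: `watch_forever` and `open_stream` are hypothetical names, and `retry_on` defaults to `OSError` purely as a placeholder; with httpx you would presumably pass the exception types shown in the tracebacks above.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple, Type


def watch_forever(
    open_stream: Callable[[], Iterable[bytes]],
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    delay: float = 1.0,
    max_reconnects: Optional[int] = None,
) -> Iterator[bytes]:
    """Yield events from open_stream(), reconnecting when it ends or fails."""
    reconnects = 0
    while True:
        try:
            for event in open_stream():
                yield event
        except retry_on:
            pass  # dropped connection or read timeout: fall through and reconnect
        if max_reconnects is not None and reconnects >= max_reconnects:
            return
        reconnects += 1
        time.sleep(delay)  # brief pause so a dead server is not hammered
```

Events that occur between two connections may be missed or repeated with this approach, so a real watcher would likely need some way to resume from a known point.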
Ah, this seems like it might be a bug. Setting timeout=None on a client instance works properly. Setting timeout=None on any of the request calls, regardless if its through the top-level API or through a client method uses the default timeout.
I'm going to ping @tomchristie on this one — it might be the intended behavior for the high-level API. I think the rationale is that users should use a client for anything advanced, and disabling timeouts surely seems like an advanced action to make.
Replace None with TimeoutConfig(connect_timeout=5, read_timeout=None, write_timeout=5) and it works. Thanks!
disabling timeouts surely seems like an advanced action to make.
I don't think disabling timeouts should be considered anything more advanced than setting a different timeout. I'm certainly fine with there being a default timeout. But if a user can simply say httpx.get(..., timeout=10), then httpx.get(..., timeout=None) should just work. I certainly wouldn't expect httpx to just flat-out ignore what is a perfectly valid input elsewhere (i.e. Client instances).
Edit: Implementation-wise, setting defaults on the methods and API functions might be a nicer way to do it. And None input should be converted to a TimeoutConfig just like a number input, shouldn't it?
I'm a little confused about the meaning of None. In some APIs, timeout=None means "use the default timeout", like
https://github.com/encode/httpx/blob/315a18b4cf81f95683e6afdc10dba3c75eb512b7/httpx/dispatch/connection.py#L69-L76
But in Client initialization, it means "don't set a timeout".
A little confused about the meaning of None. In some api, timeout=None means using the default timeout, like
That's the point I've brought up in a couple of my comments. Worse is:
import httpx
client = httpx.Client()
response = client.get(..., timeout=None)
also results in the call using the default timeout config instead of None. That surely shouldn't be the case: we're using the "advanced" way of creating a client ourselves, and in this example we just want no timeout for that specific call. But that's not possible without manually constructing a TimeoutConfig, which isn't necessary for any other value a user might be expected to give.
Hmm, yeah, seems like we should review the consistency of timeout behaviors across the various request entrypoints.
Right now, disabling timeouts via timeout=httpx.TimeoutConfig() should work in all cases.
Right now, disabling timeouts via timeout=httpx.TimeoutConfig() should work in all cases.
Yeah, I forgot to qualify my statement in the previous comment. It's now qualified : P
And just so it has its own comment:
As @florimondmanca pointed out, if you set one of the specific timeout parameters, you would have to set the others if you didn't want them to be None. httpx should probably use the default timeout config values in these cases.
Certainly the behavior of Client(timeout=...) and .get(timeout=...) ought to match.
We can either:
1. Have timeout=None indicate that no timeouts should be used. Switch the .get default value to a sentinel value such as USE_CLIENT_DEFAULT or UNSET.
2. Have timeout=False indicate that no timeouts should be used. Keep the .get default value as None. (Though explicitly setting timeout=None will still have unexpected behavior, and we can't detect the difference between that and an "unset" value.)
I definitely prefer the first option. timeout=False doesn't really make sense when read. And timeout=None is how it's spelled in requests.
I’m more inclined towards timeout=None with a sentinel object (to be used whenever the timeout config is replaced by the default config, eg in Stream.read) as well.
Fine with that too, yup.
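For readers unfamiliar with the sentinel approach agreed on above, here is a minimal sketch of how it separates "argument not passed" from an explicit `None`. The names `UNSET`, `resolve_timeout`, and `CLIENT_DEFAULT_TIMEOUT` are hypothetical and only illustrate the pattern; this is not httpx's actual implementation.

```python
class _Unset:
    """Marker type meaning 'the caller did not pass a timeout'."""

    def __repr__(self) -> str:
        return "UNSET"


UNSET = _Unset()
CLIENT_DEFAULT_TIMEOUT = 5.0  # assumed default, in seconds


def resolve_timeout(timeout=UNSET, client_default=CLIENT_DEFAULT_TIMEOUT):
    """Return the effective timeout for one request."""
    if timeout is UNSET:
        # Nothing was passed: fall back to the client-level setting.
        return client_default
    # Anything explicit wins, including None, which means "no timeout".
    return timeout
```

With this shape, `resolve_timeout()` falls back to the default while `resolve_timeout(None)` disables the timeout, so the per-request and client-level spellings can mean the same thing.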
Opened https://github.com/encode/httpx/issues/433 based on the discussions here. I think the original issue here was dealt with, so I'm closing this one!
Note, TimeoutConfig is now simply called Timeout.