Currently httpx is squarely focused on HTTP's traditional request / response paradigm, and there are well-established packages for WebSocket support such as websockets. In an HTTP/1.1-only world, this split of responsibilities makes perfect sense, as HTTP requests and WebSockets work independently.
However, with HTTP/2 already widely deployed and HTTP/3 standardisation well under way I'm not sure the model holds up:
websockets are usually tied to HTTP/1.1 only, whereas httpx has support for HTTP/2 (and hopefully soon HTTP/3).
Using the sans-IO wsproto combined with httpx's connection management we could provide WebSocket support spanning HTTP/1.1, HTTP/2 and HTTP/3. What are your thoughts on this?
One caveat: providing WebSocket support would only make sense using the AsyncClient interface.
So my assumption here would be "no", but that it's possible we'll want to expose just enough API in order to allow a websocket implementation to use httpx for handshaking the connection, and sharing the connection pool.
So, to the extent that we might allow it to be possible, I'd expect that to be in the form of a third party package, that has httpx as a dependency.
I'm not sure exactly what API that'd imply that we'd need to support, but it's feasible that we might end up wanting to expose some low-level connection API specifically for supporting this kind of use case. The most sensible way of thrashing out what we'd need there would be a proof-of-concept implementation that'd help us concretely identify how much API we'd need to make that possible.
Does all that make sense, or am I off on the wrong track here?
What you are saying does make sense, though I haven't quite made up my own mind on this.
Pros of a third-party package handling the websockets:
Cons:
Interesting points, yeah. I guess I'd be open to reassessing this as we go.
We'd need to identify what the API would look like, both at the top user-facing level, and at whatever new lower-level cutoffs we'd need to be introducing.
I'm in favor of adding websockets definitely, would be a good feature to work towards in a v2? Still so much to do to get a v1 released.
Thanks for bringing this up @jlaine! I've not had to deal with websockets so I didn't even think they'd be at home in a library like HTTPX. :)
Probably not a bad call, yeah.
I'll see if I can find time to put together a minimal PR supporting HTTP/1.1 and HTTP/2 to scope out what this entails. Obviously if there are more pressing needs for v1 there is zero pressure to merge it.
@aaugustin just pinging you so you're in the loop : this is not a hostile move against websockets, and your insights would be most welcome
I'm currently working on refactoring websockets to provide a sans-I/O layer. If it's ready within a reasonable timeframe, perhaps you could use it. It will handle correctly a few things that wsproto doesn't, if the comments I'm seeing in wsproto's source code are correct.
One of my issues with sans-I/O is the prohibition of async / await, which makes it impossible to provide a high-level API. For this reason, I'm not sure that sans-I/O as currently specified is the best approach. This makes me hesitate on a bunch of API design questions...
Building an API that httpx will consume (either strict sans-I/O or something else) would be a great step forwards. Perhaps we can cooperate on this?
Also I would really like websockets to work over HTTP/2 and HTTP/3.
I'm quite happy with my implementation of a subset of HTTP/1.1 — I'm not kidding, even though it's home-grown, I honestly think it's all right. It may be possible to achieve HTTP/2 in the same way but it will be partial e.g. it will be impossible to multiplex websockets and HTTP on the same connection. Given what I've seen of aioquic, I don't think it's reasonable to do HTTP/3 that way.
So I'm interested in working with third-party packages that handle HTTP/2 and HTTP/3 to figure out what that entails.
For me, httpx is the next-generation HTTP client replacing not only requests but also aiohttp.
So I'm hoping for a new option that can take the place of aiohttp's websocket client: https://github.com/ftobia/aiohttp-websockets-example/blob/master/client.py
Best wishes to the author, this lib is a great help for me.
PS: aiohttp is good enough, except for some frequently changing API details (like variable names), which caused backward-incompatibility errors in new versions.
FYI I started implementing a Sans-I/O layer for websockets: https://github.com/aaugustin/websockets/issues/466
It may not be obvious from the first commits because I have to untangle a lot of stuff from asyncio before I can get anything done :-)
@aaugustin That's awesome news! :) Thanks for updating this thread. It'd be interesting to see how an upgrade from h11 into the websockets state-machine might look. Something to definitely write up a POC for that little exchange.
(Also, congrats on the Tidelift sponsorship!)
So, here's how I think the bridge could work.
Since we're talking about httpx, I'm focusing on the client side here, but the same logic would apply on the server side.
In addition to the regular APIs that receive and send bytes, websockets should provide an API to receive and send already parsed HTTP headers. This API will support bolting websockets on top of any HTTP/1.1 implementation.
In addition to:
from websockets import ClientConnection
ws = ClientConnection()
bytes_to_send = ws.connect()
...
ws.receive_data(bytes_received)
I need to provide something like:
from websockets import ClientConnection
ws = ClientConnection()
request = ws.build_handshake_request()
# here request is (path, headers) tuple
bytes_to_send = serialize(request) # <-- you can do this with any library you want
...
response = parse(bytes_received) # <-- you can do this with any library you want
# here response is a (status_code, reason_phrase, headers) tuple
ws.receive_handshake_response(response)
The latter happens under the hood anyway so it's clearly doable.
It's "just" a matter of naming things :-) — which I care deeply about.
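To make the tuple-based handshake a little more concrete, here is a minimal sketch of the HTTP/1.1 side using only the standard library. The helper names (`build_handshake_request`, `check_handshake_response`, `accept_value`) are assumptions made for illustration and are not the websockets API; only the header names and the `Sec-WebSocket-Accept` computation come from RFC 6455.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server must send for `key`."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake_request(host: str, path: str = "/"):
    """Hypothetical helper: return (key, (path, headers)) for an HTTP/1.1 upgrade."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    headers = [
        ("Host", host),
        ("Upgrade", "websocket"),
        ("Connection", "Upgrade"),
        ("Sec-WebSocket-Key", key),
        ("Sec-WebSocket-Version", "13"),
    ]
    return key, (path, headers)


def check_handshake_response(key: str, response) -> None:
    """Hypothetical helper: validate a (status_code, reason_phrase, headers) tuple."""
    status_code, _reason_phrase, headers = response
    if status_code != 101:
        raise ValueError(f"expected status 101, got {status_code}")
    found = {name.lower(): value for name, value in headers}
    if found.get("sec-websocket-accept") != accept_value(key):
        raise ValueError("invalid Sec-WebSocket-Accept header")
```

Because the request and response are plain tuples, any HTTP/1.1 library could do the serializing and parsing in between; extension and subprotocol negotiation are left out of this sketch.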
WebSocket over HTTP/2 requires a different handshake, so websockets will need another API to support it. I have no idea about WebSocket over HTTP/3.
WebSockets over HTTP/3 haven't been officially specified, but it is extremely likely it will work like for HTTP/2 (RFC 8441), namely using a :protocol pseudo-header. This is what I have assumed for the aioquic demo client + server, and I believe @pgjones did the same for hypercorn.
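For comparison with the HTTP/1.1 upgrade, here is a rough sketch of the request headers for the RFC 8441 extended CONNECT handshake. The function names are hypothetical; the pseudo-headers, the `SETTINGS_ENABLE_CONNECT_PROTOCOL` identifier and the 2xx success rule are taken from RFC 8441. Whether HTTP/3 ends up identical is, as noted above, an assumption.

```python
# Setting identifier from RFC 8441; the server must advertise it with value 1
# before a client may send an extended CONNECT request.
SETTINGS_ENABLE_CONNECT_PROTOCOL = 0x8


def build_h2_websocket_headers(authority: str, path: str = "/", scheme: str = "https"):
    """Hypothetical helper: headers for opening a WebSocket on an HTTP/2 stream."""
    return [
        (":method", "CONNECT"),
        (":protocol", "websocket"),
        (":scheme", scheme),
        (":path", path),
        (":authority", authority),
        ("sec-websocket-version", "13"),
    ]


def h2_handshake_succeeded(status_code: int) -> bool:
    """Over HTTP/2 a 2xx status signals success, rather than 101."""
    return 200 <= status_code < 300
```

Note that there is no `Sec-WebSocket-Key` / `Sec-WebSocket-Accept` exchange here, which is one reason the two handshakes need separate code paths.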
I'd like us to treat this as out-of-scope at this point in time.
Yes I think we'll want to provide enough API to make this do-able at some point in the not too distant future, but being able to do that while still aiming for a squared-away, API-stable 1.0 release isn't something we're able to do just yet.
I'd recommend to keep this open and add a "deferred" label.
It might also be worth tweaking the AsyncClient documentation; currently, it says:
Async is a concurrency model that is far more efficient than multi-threading, and can provide significant performance benefits and enable the use of long-lived network connections such as WebSockets.
and indeed httpx pops up as one of the first results on Google for python websockets ^^
So, I've been thinking a bit about how we might support websockets and other upgrade protocols from httpx.
This is more of a rough idea, than a formal plan at the moment, but essentially what we'll want is for the httpcore layer to support another method in addition to the existing .request(). So that we have...
.request(<request>) -> <response>
.upgrade(<request>, <protocol string>) -> <response>, <socket stream>
Which will return an object implementing our (currently internal only) SocketStream API
Once we're happy that we've got the low-level httpcore API for that thrashed out, we'd expose it higher up into httpx with something like this...
# Using a `client` instance here, but we might also support a plain `httpx.upgrade(...)` function.
# Both sync + async variants can be provided here.
with client.upgrade(url, "websocket") as connection:
    # Expose the fundamental response info...
    connection.url, connection.headers, ...
    # Also expose the SocketStream primitives...
    connection.read(...), connection.write(), ...
With HTTP/2, the socket stream would actually wrap up an AsyncSocketStream/SyncSocketStream together with the stream ID, and handle the data framing transparently.
Protocol libraries wouldn't typically expose this interface themselves, but rather would expose whatever API is appropriate for their own cases, using httpx in their implementation to establish the initial connection. They might provide a mechanism for passing an existing httpx.Client/httpx.AsyncClient instance to their library for users who want to e.g. share WebSockets over the same connections as their HTTP requests are being handled.
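As a rough illustration of the HTTP/2 case described above, the sketch below wraps a shared connection together with a stream ID behind a plain read/write interface. Everything here is hypothetical: `SocketStream`, `FakeH2Connection` and `H2StreamAdapter` are stand-ins invented for the example, not httpx or httpcore classes.

```python
from collections import deque
from typing import Deque, Dict, List, Protocol, Tuple


class SocketStream(Protocol):
    """Hypothetical minimal stream interface a protocol library would see."""

    def read(self, n: int) -> bytes: ...
    def write(self, data: bytes) -> None: ...


class FakeH2Connection:
    """Stand-in for a multiplexed connection: tracks DATA per stream ID."""

    def __init__(self) -> None:
        self.sent: List[Tuple[int, bytes]] = []
        self.inbox: Dict[int, Deque[bytes]] = {}

    def send_data(self, stream_id: int, data: bytes) -> None:
        self.sent.append((stream_id, data))

    def feed(self, stream_id: int, data: bytes) -> None:
        # Simulates DATA arriving from the peer on one stream.
        self.inbox.setdefault(stream_id, deque()).append(data)

    def receive_data(self, stream_id: int) -> bytes:
        queue = self.inbox.get(stream_id)
        return queue.popleft() if queue else b""


class H2StreamAdapter:
    """Presents a single HTTP/2 stream as if it were a plain socket stream."""

    def __init__(self, connection: FakeH2Connection, stream_id: int) -> None:
        self._connection = connection
        self._stream_id = stream_id
        self._buffer = b""

    def read(self, n: int) -> bytes:
        if not self._buffer:
            self._buffer = self._connection.receive_data(self._stream_id)
        chunk, self._buffer = self._buffer[:n], self._buffer[n:]
        return chunk

    def write(self, data: bytes) -> None:
        self._connection.send_data(self._stream_id, data)
```

A protocol library written against `SocketStream` would not need to know whether it was handed a raw HTTP/1.1 socket or one stream of a shared HTTP/2 connection.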
Nice stuff this would give us...
trio+asyncio+whatever, without protocol implementations having to do any cleverness on their side.
We don't necessarily want to rush trying to get this done, but @aaugustin's work on https://github.com/aaugustin/websockets/issues/466 has gotten me back to thinking about it, and it feels like quite an exciting prospect.
Does this seem like a reasonable way forward here?...
/cc @pgjones @florimondmanca @jlaine @aaugustin @sethmlarson
The Sans I/O layer in websockets is (AFAIK) feature complete with full test coverage.
However, websockets doesn't use it yet and I haven't written the documentation yet.
If you have a preference between:
I can prioritize my efforts accordingly.
Also, I don't expect you to use the HTTP/1.1 implementation in websockets, only the handshake implementation (which contains the extensions negotiation logic, and you don't want to rewrite that). You will want to work with request / response abstractions serialized / parsed by httpx rather than websockets. At some point, making this possible was a design goal. I don't swear it is possible right now :-) We can investigate together the cleanest way to achieve this.
In my view, supporting sending requests over the same connection as the WebSocket for HTTP/2 is a key feature. Otherwise, an isolated connection may as well start with the HTTP/1 handshake and avoid the extra HTTP/2 complexity.
I've not followed the httpcore/httpx structure so I can't comment on the details, sorry.
I'll also make the case for wsproto, which supports WebSockets, HTTP/1-Handshakes, and HTTP/2-Handshakes. It also works very happily with h11, and h2. It is also now typed and I would say stable - indeed we've been discussing a 1.0 release.
wsproto is perfectly fine as well :-)
Thanks folks - not going to act on any of this just yet since it's clearly a post 1.0 issue.
In the first pass of this we'd simply be exposing a completely agnostic connection upgrade API, which would allow third party packages to build whatever websocket implementations they want, piggybacking on top of httpx HTTP/1.1 and HTTP/2 connections.
We could potentially also consider built-in websocket support at that point, but let's talk about that a little way down the road.
So my initial take on this is that we'd want to first expose a connection upgrade API, and then build websockets support over that, so something like...
def request(...) -> ... # Our standard request/response interface at the Transport API level.
def connect(...) -> Connection # New method at the Transport API level, exposing raw connection capabilities.
Then at the client level, have websocket support, that uses connect under the hood.
async with client.websocket(...) as ws:
    await ws.send(...)
    message = await ws.receive()
There's a couple of potential awkwardnesses about that tho...
httpx, rather than having it in httpcore.
ASGITransport would need to de-marshall the bytes on the connection back into websocket events, rather than "seeing" an API abstraction that's at the level of the events themselves.
So, after a bit more thinking, I reckon we should instead aim to provide a Transport-level interface specifically for websockets.
def request(...) -> ... # Our standard request/response interface at the Transport API level.
def websocket(...) -> WebSocket # New method at the Transport API level, exposing websocket capabilities.
This wouldn't preclude us possibly also adding a raw "just give me the connection" level abstraction too at some other point, in order to provide for low-level CONNECT, Upgrade and HTTP/2 bi-directional streaming support. But we could treat that as a lower priority, depending on if we're actually seeing any demand/use-cases for exposing those capabilities.
def request(...) -> ... # Our standard request/response interface at the Transport API level.
def websocket(...) -> WebSocket # New method at the Transport API level, exposing websocket capabilities.
def connect(...) -> Connection # New method at the Transport API level, exposing raw CONNECT/Upgrade/HTTP2 data streaming capabilities.
The information returned from the low-level websocket method would need to be all the standard response code/headers stuff, plus an interface that just exposes something like send_event(), receive_event() and close().
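A minimal sketch of what such an event-level object could look like, assuming the three methods named above plus the response status and headers. The event classes and the in-memory `EchoWebSocket` are hypothetical, made up here to show the shape of the interface rather than any agreed design.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional, Tuple, Union


@dataclass
class TextEvent:
    text: str


@dataclass
class BytesEvent:
    data: bytes


@dataclass
class CloseEvent:
    code: int = 1000
    reason: str = ""


Event = Union[TextEvent, BytesEvent, CloseEvent]


class EchoWebSocket:
    """Hypothetical in-memory transport object that echoes events back."""

    def __init__(self, status_code: int = 101,
                 headers: Optional[List[Tuple[str, str]]] = None) -> None:
        # Standard response information from the handshake.
        self.status_code = status_code
        self.headers = headers or []
        self._queue: Deque[Event] = deque()
        self.closed = False

    def send_event(self, event: Event) -> None:
        if self.closed:
            raise RuntimeError("connection is closed")
        self._queue.append(event)

    def receive_event(self) -> Event:
        # With nothing left to echo, report a normal closure.
        return self._queue.popleft() if self._queue else CloseEvent()

    def close(self) -> None:
        self.closed = True
```

An in-process transport such as an ASGI one could implement the same three methods by passing events through directly, with no bytes being framed or parsed at all.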
There's also the question of what the API ought to look like at the httpx level.
One thing that tends to be a bit fiddly here is that websockets can return either bytes or str frames, but we'd still like our users to be able to access the data in a nice type-checked fashion. Rather than expose multiple kinds of send/receive methods, we might be able to do with just these, by...
ws.send(text=...) to distinguish on sending.
msg = ws.receive(); print(msg.text) to distinguish on receiving. The .text property could strictly return str, and could raise an error if the received data frame was actually in the unexpected binary mode.
json argument in both cases. We could default to handling JSON over text frames. Optionally we might include a flag on the client.websocket() method setting the expected mode to either text or binary, and using that for the JSON framing, plus erroring out if the incorrect style is sent/received anytime.
def send(*, text: str = None, content: bytes = None, json: typing.Any = None)
def receive() # Returns an object with `.text`, `.content`, `.json`, `.payload [str|bytes]`
def close()
We can transparently deal with ping/pong frames during any .send()/receive(), and context managed async transports could also send background pings.
We might well also want .iter_bytes(), .iter_text(), and .iter_json() methods, for convenience.
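To illustrate the type-checked access idea, here is a small sketch of a received-message object and a helper that turns the `send()` keyword arguments into a frame payload. `Message` and `encode_message` are hypothetical names, and JSON over text frames is the assumed default, as suggested above.

```python
import json as jsonlib
from typing import Any, Optional, Union


class Message:
    """Hypothetical wrapper around one received data frame."""

    def __init__(self, payload: Union[str, bytes]) -> None:
        self.payload = payload

    @property
    def text(self) -> str:
        if not isinstance(self.payload, str):
            raise TypeError("received a binary frame, expected text")
        return self.payload

    @property
    def content(self) -> bytes:
        if not isinstance(self.payload, bytes):
            raise TypeError("received a text frame, expected binary")
        return self.payload

    def json(self) -> Any:
        # json.loads accepts both str and bytes payloads.
        return jsonlib.loads(self.payload)


def encode_message(*, text: Optional[str] = None,
                   content: Optional[bytes] = None,
                   json: Any = None) -> Union[str, bytes]:
    """Return the frame payload for exactly one of text, content or json."""
    given = [arg for arg in (text, content, json) if arg is not None]
    if len(given) != 1:
        raise TypeError("pass exactly one of text, content or json")
    if text is not None:
        return text
    if content is not None:
        return content
    return jsonlib.dumps(json)  # assumed default: JSON goes over text frames
```

One limitation of this sketch is that `json=None` cannot be told apart from "not passed", so a JSON `null` could not be sent this way; a sentinel default would be needed for that.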
I'm aware this is all in a bit of a "jotting down" style, please do feel free to let me know if I'm not being clear enough about anything here.