Httpx: Transport API

Created on 16 Jan 2020 · 9Comments · Source: encode/httpx

What is the dispatcher API, and why is it useful?...

The dispatcher is the part of the system that's responsible for actually sending the request and returning a response to the client. Having an API to override the dispatcher used by the client allows you to take complete control over that process, and switch out the default behaviour with something else.

For instance the "plug in to a WSGI app" and "plug in to an ASGI app" cases that we support are implemented as dispatcher classes, that send the request to a Python web application, rather than sending a network request.

Currently the dispatchers are considered private API, but really we'd like a public API here at some point. This issue is for discussing what we think that API should be.

Right now the API is...

 def send(request: Request, timeout: Timeout = None) -> Response:
    ...

def close():
    ...

Withasync equivelents for the async case.

That's actually pretty okay, but if we're looking at a public API, we ought to figure out if it's actually what we want to expose. There's a case to be made for making the dispatcher API just take plain primitive datastructures, rather than dealing with Request/Response models. That'd make for a more portable interface, that could be used in other packages without importing httpx.

That might look something like this...

def send(self, method: bytes, url: Typle[bytes, bytes, int, bytes], headers: List[Tuple[bytes, bytes]], body: ClosableIterator[bytes] = None, timeout: Timeout = None):
    ...
    return (status_code: int, http_version: bytes, reason_phrase: bytes, headers: List[Tuple[bytes, bytes]], body: ClosableIterator[bytes])

def close():
    ...

Both the request and response bodies would be byte-iterable, closable interfaces (or byte async-iterable, closable interfaces). Everything else would be plain ol' Python datastructures.

Thoughts:

It's feasible that url could be defined as a tuple or url components instead, so that we can keep pass a parsed url objects from the client to a dispatcher interface? **Update: I've tweaked the interface to describe it that way around, as "scheme", "host", "port", "full_path".
We do need to have the timeout be part of the dispatch interface, since we want to be able to support configured-per-request timeout values - I'd guess we'd need to define that as an optional four-tuple of connect/read/write/pool float timeout values.
We might want to think about defining expected exception types.
There's no support for HTTP trailing headers in this interface. Perhaps we don't care about that, or could the ContentStream.close() perhaps optionally return a list of headers, maybe?

Something that'd potentially be interesting about defining this interface in a really strict, primitive datastructures way, would be that we could pull out our dispatcher implementations into a really tightly scoped package httpcore, that'd just provide the plain network dispatchers, without any client smarts around them. (Interesting amongst other things because it'd provide a nice just-the-network-backend for other client libs to potentially use too / switch too.)

I think a useful first take onto all of this might? be an exploratory pull request that switches us over to using a plainer dispatch API, just to get a feel for how that'd look. A very first initial take on that could still use the URL, ContentStream, and Timeout primitives.

(This also potentially ties in with whatever the hip team's plans are. For instance, right now we're using our own dispatcher implementation - currently in the async case only, but planned for both cases - but we would have the option of using the hip backend, once it's available. It's not obvious to me if we'd want to do that, or if we'd prefer to continue with our existing backend, which is looking pretty nice now, but it could be a point of collaboration, or something to discuss.)

Alternately, we might decide that this falls into architecture astronauting territory, and that the existing API is "just fine thankyouverymuch".

discussion

Source

tomchristie

👀2

Most helpful comment

Gives me warm fuzzies how tightly defined our low-level transport API is. 👍💚

tomchristie on 21 May 2020

🚀2

All 9 comments

I think it's probably a good time for us to start pushing ahead with this, and ensuring that we've got a nice clean interface split between everything in the dispatcher implementation vs. everything at the client level.

Here's a plan towards how we could approach that...

Drop cert, verify, trust_env from ConnectionPool, and configure using a plain ssl_context: SSLContext = None at that level. (The client would pass ssl_config.ssl_context when instantiating the ConnectionPool)
Drop pool_limits from ConnectionPool, and instead provide max_keepalive and max_connections configuration at that level. (The client would pass pool_limits.soft_limit/pool_limits.hard_limit when instantiating the ConnectionPool) - Related to that we might want to rename soft_limit/hard_limit?
Switch the dispatcher API to a plain no-models style, as described here.

At that point we've got both the Dispatcher API, and the connection pool init nicely decoupled from any model/httpx-config-classes info, and it'd probably be a good point to tackle our unasync'ed sync connection pool implementation, that'd allow us to drop urllib3 as the sync default.

I think that a neat tack at that point might be to spin up a completely seperate httpcore repo, containing just the dispatch and backends, and the tests for those. We wouldn't include the ASGI, WSGI, and urllib3 dispatch classes in there either since we won't be running unasync on those. Then...

Get the tests & CI all passing cleanly on the httpcore repo.
Add unasync into the mix, to provide a SyncConnectionPool implementation.
Document that package up as a low-level no-models, no-redirects, no-cookie-handling, no-client-smarts HTTP implementation.
Switch our async dispatcher implementations over to use httpcore.AsyncConnectionPool, and httpcore.AsyncHTTPProxy.
Switch our sync dispatcher implementation over to use httpcore.SyncConnectionPool and httpcore.SyncHTTPProxy. (With our URLLib3Dispatcher() class remaining as an optional alternative)

One possible outcome that designing out a really strict interface could potentially have here would be to put us in a better position to be able to collaborate with the hip team, since we've got a nice clear options of "switch to a hip-based backend" or "switch to a hip-based backend for HTTP/1.1 only, or httpcore if HTTP/2 is enabled". Not clear if that's feasible, but at least it gives us options.

I'm not absolutely sold on the neccessity of pulling the dispatch out into a seperate package, but...

I think the initial steps of switching to a no-models init and dispatch API are worth us doing anyways.
It'd be quite nice to have all the unasync and "actually make a network request" stuff partitioned into one place, with all the client smarts and model APIs on top of that be completely independant.
We could potentially treat the httpcore work as exploratory and take a view on "just include this all in the same package" vs. "enforced split" once we've done the unasync work.

tomchristie on 30 Jan 2020

❤1

There's other potentially productive points of collaboration here too, since httpcore would be a nice minimal dependency for other client libs to use for their core networking too.

One good example here would be adding a requests adapter class using httpcore. https://github.com/psf/requests/blob/master/requests/adapters.py#L55

tomchristie on 30 Jan 2020

a completely seperate httpcore repo

I can't express how excited I am about this.

We've been talking about trying to draw the line between "low-level/network/just-make-an-HTTP-request stuff" vs "client-with-smarts stuff" for some time, including reaching out for comments in https://github.com/python-http/discussions/issues/9.

So the idea of "let's just try to make it happen" by experimenting from own our code is exciting because it gives a concrete view on how things could be organized.

florimondmanca on 30 Jan 2020

🎉1

I can't express how excited I am about this.

Ace! 🌟😃🌟

I've created https://github.com/encode/httpcore - for my part I'll probably start by drawing up the documentation for it. We essentially know what the interface will look like.

(Also, we'll repurpose the httpcore PyPI project which httpx was under while it was still in its earliest days of devlopement.)

tomchristie on 30 Jan 2020

Tangentially related to prior renames/rebrands, the https://pypi.org/project/requests-async/ and https://pypi.org/project/http3/ projects should have their documentation updated to point to https://pypi.org/project/httpx/ / https://www.python-httpx.org/

StephenBrown2 on 30 Jan 2020

Here we go... 👉 https://www.encode.io/httpcore/

tomchristie on 30 Jan 2020

❤2

@StephenBrown2 Yup, good point.

tomchristie on 30 Jan 2020

Things I think we need to do for 1.0...

Move to transport=..., with dispatch=... placed on the deprecation path, right?
Include a "Custom Transport classes" section in the advanced docs, which:
- Outlines the transport API.
- Provides an example of instantiating a transport directly, eg. transport = httpcore.SyncConnectionPool(...). to help users see how that fits together.
- Provide an example with transport = httpx.URLLib3Transport(...).

We might also need to think about a "mount" API, and/or how to document proxy transport classes alongside this.

tomchristie on 13 May 2020

Gives me warm fuzzies how tightly defined our low-level transport API is. 👍💚

tomchristie on 21 May 2020

🚀2

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Private module names

tomchristie · 3Comments

More explicit documentation that `httpx.Client` is equivelent to `requests.Session`.

innawe · 3Comments

Cache NETRC logins after first lookup

florimondmanca · 4Comments

Passing AsyncClient from an 'async with' context behaves unexpectedly

StevenBorg · 3Comments

Test case `test_write_timeout` fails

kde713 · 3Comments