Httpx: Retry requests

Created on 25 Jun 2019 · 24Comments · Source: encode/httpx

urllib3 has a very convenient Retry utility (docs) that I have found to be quite useful when dealing with flaky APIs. http3's Clients don't support this sort of thing yet, but I would love it if they did!

In the meantime, I can probably work out my own with a while loop checking the response code.

Source

StephenBrown2

Most helpful comment

Not documented in the QuickStart or Advanced guides there, but it is part of the API reference. https://2.python-requests.org/en/master/api/#requests.adapters.HTTPAdapter

So, sure, let's treat that as in-scope.

tomchristie on 26 Jun 2019

👍4 🎉1

All 24 comments

I think our tack onto this should probably be to match requests API wherever possible (so no built-in retry functionality). What we should defiantly do tho would be to do a good job of documenting how to implement this or other functionality with a custom dispatcher. That way we can make sure we're allowing for a range of differing behaviors, without having to extend our API surface area.

I'm not absolute on this one, so perhaps could reassess it at a later date, but for the time being I'd like to treat it as out of scope.

tomchristie on 26 Jun 2019

I'm actually of the opposite opinion here. Every requests usage I've seen in the wild uses urllib3's Retry, let's not repeat that lapse of API. :)

sethmlarson on 26 Jun 2019

Oh right, I was under the impression that it disables it, but I must be misremembering?

tomchristie on 26 Jun 2019

Okay, looks like it's disabled by default, but can be enabled in the HTTPAdapter API... https://github.com/kennethreitz/requests/blob/master/requests/adapters.py#L113

tomchristie on 26 Jun 2019

Not documented in the QuickStart or Advanced guides there, but it is part of the API reference. https://2.python-requests.org/en/master/api/#requests.adapters.HTTPAdapter

So, sure, let's treat that as in-scope.

tomchristie on 26 Jun 2019

👍4 🎉1

I have used backoff atm, but it would be nice to have some native solution 👍

Hellowlol on 26 Jun 2019

Should the urllib3 team be pinged on implementation details? I could potentially work on porting it straight over, but I'm unsure about where it would lie in this library, and if someone else knows better on how it would interact with Sync vs Async they would probably be better to implement.

StephenBrown2 on 9 Jul 2019

I'd love to be put on the review if you're willing to take a stab at implementation. :) No worries on getting it right the first time.

sethmlarson on 9 Jul 2019

Also want to give attribution where it's due, so would probably start with a copy and reference comment, then work on async-ifying it... Will put some effort into it over the next couple weeks. I don't have much free time, so I'm not opposed to duplicate work on it.

StephenBrown2 on 9 Jul 2019

It might be worth tackling it API-first, before jumping in on the implementation.

How many controls and dials do we really want the retry API to provide? What critical bits of control over that are actually needed/used in the wild? Whatever is the most minimal possible sounds good to me - we can always extend it out further in time if needed.

We may also want to think about allowing for more complex cases through customization rather than configuration. Eg. only provide for specifying "allow N retries", but also provide an implementation hook that folks can build more complex cases off of.

That way you keep the API surface area nice and low, while still allowing flexiblity, or third party packages.

tomchristie on 9 Jul 2019

👍1

I'll start by enumerating urllib3's existing functionality for Retry:

read Number of times that a read operation can fail
write Number of times that a write operation can fail
connect Number of times that an operation related to creating a connection can fail (Doesn't apply to read timeouts on TLS, those are under read)
redirect Number of times that a redirect can be followed
status Number of times that we can retry a request based on the received responses status code being in the status_forcelist and the request method being in method_whitelist.
Exponential backoff
raise_on_redirect and raise_on_status returns whether or not we should raise an error or just return the response.
respect_retry_after is whether a retry based on a response should respect he Retry-After header by sleeping.
remove_headers_on_redirect is a set of headers that should be removed from subsequent requests when a redirect is issued.

sethmlarson on 9 Jul 2019

IMO the raise_on_redirect and raise_on_status are things that don't need to be attached to the Retry class?

sethmlarson on 9 Jul 2019

Here's a typed skeleton of the current Retry object's public interface:
https://gist.github.com/StephenBrown2/2fc6bab18b30037488deb0f4db92e001

StephenBrown2 on 9 Jul 2019

So I definitely want to make the new Retry object have one critical feature that actually makes sub-classing useful: Have a method that is subclassable that passes in the Request and the Response and then by some interface is allowed to decide whether a retry occurs and in addition also allows modifying the Request that will be emitted for that retry.

This exact functionality is tough to implement in urllib3 because we don't allow you to modify headers and we don't give the user everything they might want to know (such as the whole request that was sent)

Doing this allows so many things to be implemented simply by sub-classing the Retry. It's pretty critical actually that we get this interface right because there's a lot of functionality related to HTTP that involves multiple requests (Authentication, stripping headers on redirect, retry-after, resumable uploads, caching) that users have been panging for but is tough to implement without the right interface.

sethmlarson on 9 Jul 2019

I'd rather Retry was an ABC and could be implemented without subclassing

graingert on 10 Jul 2019

Could you explain the benefit of not needing to implement with sub-classing in this situation? I'm not seeing one right away.

sethmlarson on 10 Jul 2019

There’s some great stuff to dig into here, tho I don’t think it’s on the critical path.

The API will be retries=int|RetryConfig available either per-client or per-request.

Finer grained control and/or a method override will then exist on the RetryConfig.

tomchristie on 10 Jul 2019

Personally, I don't see a need to support the int option, as all the options could have good defaults and one can simply use retries=Retries() to get those.

What I would like to see, as the minimum for my needs right now, is:

total: int, Need this to be an overall limit, read/write specificity not needed, but I can see it would be useful as reads would be retryable much more than writes, though that can be handled with...
- Expected default: 3
status_codes: set, A set of status codes (Or HTTPStatus values?) to retry on, as the API I'm working with has no real concept of the difference between GET and POST, so also good would be...
- Expected default: frozenset({429, 503})
http_methods: set, A set of HTTP methods that can be retried, would take care of the read/write specificity as mentioned above.
- Expected default: frozenset({"HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"})
sleep/backoff: int, I would assume respect_retry_after to be True by default, and specifying the sleep option would be a constant time to wait between retries if there is no Retry-After header, while backoff would be an exponential backoff factor.
- Expected default: 0

https://pypi.org/project/backoff/ as mentioned by @Hellowlol might be something to look at for inspiration as well.

StephenBrown2 on 10 Jul 2019

Noting here that having respect_retry_after on by default can cause a DoS attack on the HTTP client by setting excessive wait times. If it's on by default we need a reasonably short max_retry_after value.

And I'm of the opinion that having great defaults for the RetryConfig makes having an int -> total being acceptable being ever more desirable rather than less.

sethmlarson on 10 Jul 2019

And I'm of the opinion that having great defaults for the RetryConfig makes having an int -> total being acceptable being ever more desirable rather than less.

100%. No reason folks should have to think about this.

tomchristie on 10 Jul 2019

Noting here that having respect_retry_after on by default can cause a DoS attack on the HTTP client by setting excessive wait times. If it's on by default we need a reasonably short max_retry_after value.

And I'm of the opinion that having great defaults for the RetryConfig makes having an int -> total being acceptable being ever more desirable rather than less.

How about just a max_retry_after which can be set to 0 to not respect Retry-After

graingert on 10 Jul 2019

I'm okay with having max_retry_after being zero mean we don't respect Retry-After and I guess None being no limit? We just have to be careful designing APIs where specific values have special meanings because it makes extending those APIs tougher and more confusing for the user and makes code using our library harder to read. In this case I think that Retry-After is very well defined and unlikely to be extended so can be treated this way.

sethmlarson on 10 Jul 2019

Since 0 is Falsey, that would work, though I might prefer an explicit False being documented, the check would remain the same if respect_retry_after and max_retry_after: or similar.

StephenBrown2 on 11 Jul 2019

Per #134 we're going to hold off on implementing this feature.

sethmlarson on 31 Jul 2019

Was this page helpful?

0 / 5 - 0 ratings

Related issues

AttributeError: module 'rfc3986.api' has no attribute 'iri_reference'

szelenka · 4Comments

Question: Enforce HTTP version for particular request

paulchubatyy · 4Comments

Test case `test_write_timeout` fails

kde713 · 3Comments

Supporting `Response(content=..., text=..., html=..., json=...)`

tomchristie · 3Comments

Ability to use httpx.AsyncClient for ASGI testing without persisting cookies or open/closing it

simonw · 3Comments