Requests: Response history is not available in response hooks during redirections

Created on 30 Aug 2020 · 6 Comments · Source: psf/requests

Response history is not available in response hooks during multiple redirections.
For example, if a GET request is redirected 5 times, the response history stays empty when examined from inside a response hook after each redirection. However, the response object that get() returns contains the full history.

In some cases it would be useful to know, for example, what the initial requested URL was from inside a hook. However, currently I think it's not possible due to the lack of history.

Expected Result

Up-to-date history available in response hook after each redirection.

Hook history for https://ca.fi/r1.php: []
Hook history for https://ca.fi/r2.php: [<Response [302]>]
Hook history for https://ca.fi/r3.php: [<Response [302]>, <Response [302]>]
Hook history for https://ca.fi/r4.php: [<Response [302]>, <Response [302]>, <Response [302]>]
Response history for https://ca.fi/r4.php: [<Response [302]>, <Response [302]>, <Response [302]>]

Actual Result

Response history is empty when examined from response hooks.

Hook history for https://ca.fi/r1.php: []
Hook history for https://ca.fi/r2.php: []
Hook history for https://ca.fi/r3.php: []
Hook history for https://ca.fi/r4.php: []
Response history for https://ca.fi/r4.php: [<Response [302]>, <Response [302]>, <Response [302]>]

Reproduction Steps

import requests

def print_resp_hist(r, *args, **kwargs):
    print("Hook history for %s: %s" % (r.url, r.history))
    return r

r = requests.get(
    'https://ca.fi/r1.php',
    hooks={'response': [print_resp_hist]},
    allow_redirects=True
)

print("Response history for %s: %s" % (r.url, r.history))

System Information

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": "3.0"
  },
  "idna": {
    "version": "2.10"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.8.2"
  },
  "platform": {
    "release": "5.4.0-42-generic",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "1010107f",
    "version": "19.1.0"
  },
  "requests": {
    "version": "2.24.0"
  },
  "system_ssl": {
    "version": "1010106f"
  },
  "urllib3": {
    "version": "1.25.10"
  },
  "using_pyopenssl": true
}

All 6 comments

This is undocumented but has always been the behaviour. Each part of the history only has its own history because that is created at the very end before returning from the method that handles the request. In short, your expectation is not something that was documented as working, was never intended to work that way, and would require more resources than are available to make a reality.
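To make that ordering concrete, here is a toy model of it. This is not the code in requests: `FakeResponse`, `fetch_chain`, and the `example.invalid` URLs are hypothetical stand-ins, and the final status of 200 is an assumption. It only illustrates the point above, that a hook fires as each response arrives, while the history is attached once the whole chain has been followed.

```python
class FakeResponse:
    """Hypothetical stand-in for a response object; not part of requests."""

    def __init__(self, url, status_code):
        self.url = url
        self.status_code = status_code
        self.history = []  # starts empty, like a freshly built response

    def __repr__(self):
        return "<Response [%d]>" % self.status_code


def fetch_chain(urls, hook):
    """Walk a fixed redirect chain, calling `hook` once per hop."""
    hops = []
    for index, url in enumerate(urls):
        is_last = index == len(urls) - 1
        resp = FakeResponse(url, 200 if is_last else 302)
        hook(resp)  # the hook fires here, before any history exists
        hops.append(resp)
    final = hops.pop()
    final.history = hops  # history is attached only once the chain is done
    return final


seen_in_hook = []


def record_history(resp):
    # Copy the list so later mutation cannot change what was recorded.
    seen_in_hook.append(list(resp.history))


chain = ["https://example.invalid/r%d" % n for n in range(1, 5)]
final = fetch_chain(chain, record_history)

print(seen_in_hook)   # [[], [], [], []]
print(final.history)  # [<Response [302]>, <Response [302]>, <Response [302]>]
```

The printed result mirrors the "Actual Result" in the report: every hook call sees an empty history, and only the returned object carries the full list.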

@sigmavirus24, thanks for the answer.
Do you know if there is any way or workaround to get the initial URL from the first request to the subsequent hooks?

Your hook only needs to be callable.

import requests

class Hook:
    def __init__(self):
        self.urls = []

    # requests passes extra keyword arguments to hooks, so accept and ignore them
    def __call__(self, response, *args, **kwargs):
        self.urls.append(response.request.url)


requests.get(url, hooks={"response": [Hook()]}, ...)

That should do what you want as a basic skeleton.
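A slightly fuller sketch of the same idea, for anyone who wants to try it without a network connection. The names `UrlRecorder`, `initial_url`, and `fake_response` are made up for illustration, and the fake objects expose only the one attribute the hook reads. It assumes the hook is invoked once per hop in redirect order, as the output in the original report suggests.

```python
from types import SimpleNamespace


class UrlRecorder:
    """Callable response hook that remembers every request URL it sees."""

    def __init__(self):
        self.urls = []

    def __call__(self, response, *args, **kwargs):
        # Accept and ignore the extra keyword arguments passed to hooks.
        self.urls.append(response.request.url)
        # Returning None leaves the response unchanged.

    @property
    def initial_url(self):
        # The first URL recorded is the one originally requested.
        return self.urls[0] if self.urls else None


def fake_response(url):
    # Hypothetical stand-in exposing only the attribute the hook reads.
    return SimpleNamespace(request=SimpleNamespace(url=url))


recorder = UrlRecorder()
for n in range(1, 5):
    # Simulate one hook call per hop of a four-step redirect chain.
    recorder(fake_response("https://example.invalid/r%d" % n), timeout=None, verify=True)

print(recorder.initial_url)  # https://example.invalid/r1
print(len(recorder.urls))    # 4
```

With requests itself, the same object would be passed as hooks={"response": [recorder]}. Since the recorder keeps state, use a fresh instance for each top-level call so that separate redirect chains do not get mixed together.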

I managed to resolve this with the following hotfix to sessions.py:

--- sessions.py 2020-08-31 14:16:56.530230745 +0000
+++ sessions.py 2020-08-31 14:18:18.359703713 +0000
@@ -234,6 +234,9 @@
                 yield req
             else:

+                hooks = prepared_request.hooks
+                prepared_request.hooks = []
+
                 resp = self.send(
                     req,
                     stream=stream,
@@ -245,6 +248,10 @@
                     **adapter_kwargs
                 )

+                resp.history = hist
+                resp = dispatch_hook('response', hooks, resp)
+                prepared_request.hooks = hooks
+
                 extract_cookies_to_jar(self.cookies, prepared_request, resp.raw)

                 # extract redirect url, if any, for the next loop

Probably not the cleanest way to implement this, but it seems to work, at least for my use case.

That looks to be in resolve_redirects, so you're breaking history and not returning accurate history for each part of the redirect cycle. You may find it does near enough to what you want, but I'd strongly discourage others from using that patch.

I seem to be getting exactly the same redirect history in the final returned response with and without the patch. The history seen inside the hooks also seems to match the redirects correctly.

But yeah, that doesn't change the fact that this is an ugly hack and should not be used by anyone without actually making sure that nothing gets unexpectedly broken.
