Describe the bug
Running a read intense application and at some point of time, a time out happens while the application is trying to read from the blob storage (we have an active incident to identify why this is happening).
TimeoutError: [Errno 110] Connection timed
Timeout lasts for approximately 15minutes and then while the codebase tries to recover a new exception happens from the storage SDK with the following message:
AuthenticationErrorDetail:Request date header too old: DATE OF TIMED OUT REQUEST
To Reproduce
Not sure how you could cause the 15minute timeout but perhaps the following can replicate results:
Expected behavior
The Request date header too old exceptions should not be thrown.
Stack trace
Fatal read error on socket transport
protocol: <asyncio.sslproto.SSLProtocol object at 0x7f1cf667a5c0>
transport: <_SelectorSocketTransport fd=121 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
File "/home/azureuser/genfiles/external/python_runtime/python3/lib/python3.6/asyncio/selector_events.py", line 727, in _read_ready
data = self._sock.recv(self.max_size)
TimeoutError: [Errno 110] Connection timed out
ERROR 2020-09-08 16:08:45,961 customclass load_blob_file status=error, duration_ms=958266.8999999999
Traceback (most recent call last):
File "/home/azureuser/bin/azureuser/azure/storage/blob/aio/_download_async.py", line 271, in _initial_request
**self._request_options)
File "/home/azureuser/bin/azureuser/azure/storage/blob/_generated/aio/operations_async/_blob_operations_async.py", line 169, in download
raise models.StorageErrorException(response, self._deserialize)
azure.storage.blob._generated.models._models_py3.StorageErrorException: Operation returned an invalid status 'Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/azureuser/bin/azureuser/services/storage/blob_storage/blob_loader_storage.py", line 128, in load_blob_file
file = self._hot_storage.load_file(file_id)
File "/home/azureuser/bin/azureuser/services/storage/blob_storage/azure_blob_storage.py", line 60, in load_file
loop=self._get_or_create_event_loop(),
File "/home/azureuser/genfiles/external/python_runtime/python3/lib/python3.6/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/home/azureuser/genfiles/external/python_runtime/python3/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/home/azureuser/bin/azureuser/services/storage/blob_storage/azure_blob_storage.py", line 79, in _load_blob_async_into_byte_stream
storage_stream_downloader = await blob_client.download_blob()
File "/home/azureuser/bin/azureuser/azure/core/tracing/decorator_async.py", line 74, in wrapper_use_tracer
return await func(*args, **kwargs)
File "/home/azureuser/bin/azureuser/azure/storage/blob/aio/_blob_client_async.py", line 335, in download_blob
await downloader._setup() # pylint: disable=protected-access
File "/home/azureuser/bin/azureuser/azure/storage/blob/aio/_download_async.py", line 225, in _setup
self._response = await self._initial_request()
File "/home/azureuser/bin/azureuser/azure/storage/blob/aio/_download_async.py", line 306, in _initial_request
process_storage_error(error)
File "/home/azureuser/bin/azureuser/azure/storage/blob/_shared/response_handlers.py", line 147, in process_storage_error
raise error
azure.core.exceptions.ClientAuthenticationError: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:4a1422e8-101e-0039-72fa-85b01b000000
Time:2020-09-08T16:08:45.9594457Z
ErrorCode:AuthenticationFailed
Error:None
AuthenticationErrorDetail:Request date header too old: 'Tue, 08 Sep 2020 15:52:47 GMT'
ERROR 2020-09-08 16:08:45,963 customclass loader_fetch_batch_load_data status=error, duration_ms=958596.6
Traceback (most recent call last):
File "/home/azureuser/bin/azureuser/azure/storage/blob/aio/_download_async.py", line 271, in _initial_request
**self._request_options)
File "/home/azureuser/bin/azureuser/azure/storage/blob/_generated/aio/operations_async/_blob_operations_async.py", line 169, in download
raise models.StorageErrorException(response, self._deserialize)
azure.storage.blob._generated.models._models_py3.StorageErrorException: Operation returned an invalid status 'Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.'
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.
hi @andreasbotsikas
Thanks for reaching out.
For the connection timeout error, you can set connection_timeout=2 on BlobServiceClient/BlobClient/ContainerClient
let me know if this doesn't help
Thanks @xiafu-msft , can you please provide an example of how to specify the connection_timeout? I can't find any reference to that constructor property in ContainerClient/BlobClient or BlobServiceClient. I found a reference to socket_timeout in readthedocs, is this the one you are referring to?
Moreover, is the behavior of caching the original request date normal or could the SDK dispose the current context and try to authenticate with a new one?
Hi @andreasbotsikas
Take BlobClient as an example:
BlobClient(youraccounturl, credential=storage_account_key, container_name='foo', blob_name='bar', connection_timeout=2)
Meanwhile it's weird that you get Request date header too old error
can you check if you computer has the correct time? If it does, then can you try some other api like get_blob_properties() and see if it can succeed?
Thanks for your help
Thanks for the example! I guess we can similarly modify the read_timeout, based on the source. I noticed that connection_timeout's default value is 20 seconds and read_timeout's default value is 80000 sec. Our timeout happened after 10mins so is it fair to assume that we connected to the storage account and the timeout happened during read?
As for the error, I agree that this is a very odd one and it is not system related (the computer does have the correct time). The system is running 48 parallel processes and the rest of the threads are working fine. Only the one that got the 10minute timeout gets an follow up exception that the Request date header too old. This is exactly the reason why I thought of reporting this issue as it seems to me that something is being cached.
As for replication, unfortunately it is very random when it happens (I will forward you the ticket). Let me try to replicate that with fiddler and I will come back.
We experience the same ~15min timeout issue, when iterating for the next message on a queue with multiple pollers. It also happens randomly
https://github.com/Azure/azure-sdk-for-python/blob/0e82fae6b881d39497f8eb6890a33969dd9eec31/sdk/core/azure-core/azure/core/pipeline/transport/_requests_basic.py#L120
https://github.com/Azure/azure-sdk-for-python/blob/0e82fae6b881d39497f8eb6890a33969dd9eec31/sdk/core/azure-core/azure/core/pipeline/transport/_requests_basic.py#L137
This seems to be an oversight, time.sleep(1000) is 15 minutes wait.
I think it was mistakenly introduced by this PR https://github.com/Azure/azure-sdk-for-python/pull/5785#pullrequestreview-247892247 , thinking it was in milliseconds but it's actually seconds
I used mitmproxy to intercept the blob read requests and in my case it is retrying 4 times before throwing the ServiceRequest Error.
What I noticed is that it is sending a cached x-ms-date which could explain the Request date header too old error:

interesting.. nice work on the reproduction script.
Maybe it's doing a backoff retry, and during these retries it doesn't update the request date?
I'll wait for Azure support to answer.
I should add that in our case, there is no reason for 15 minute timeouts. It looks a bit like we're being throttled
I went a bit further in the investigation:
I put a logger in this place:
https://github.com/Azure/azure-sdk-for-python/blob/0e82fae6b881d39497f8eb6890a33969dd9eec31/sdk/core/azure-core/azure/core/pipeline/transport/_requests_basic.py#L258
to get the actual timeout that was used.
I tried with my code (on azure queue), and with your (blob storage). We both use DefaultAzureCredential
The same thing happens: 3 requests are sent in total, the first 2 requests to get a token (from http://169.254.169.254/metadata/identity/oauth2/token)
timeout (0.3, 300) to http://169.254.169.254/metadata/identity/oauth2/token`
timeout (604800, 300) to http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://storage.azure.com
timeout (1,1) to "https://<service>.core.windows.net"
So the first 2 requests have a timeout of 300 seconds, and I'm not sure how to override that. The tokens are valid for an hour, so the client only sends these requests once an hour, but if it fails the timeout is very long.
It's because of the default retry policy which is ExponentialRetry.
So it retries 3 times, the backoff delay is (in average, there is some jitter):
15 + 18 + 24 = 57 seconds
Added to the 3 times * 300 seconds timeout, it makes 15 min 57 seconds.
Which is almost exactly the duration in the initial report:
AuthenticationErrorDetail:Request date header too old: 'Tue, 08 Sep 2020 15:52:47 GMT'
ERROR 2020-09-08 16:08:45,963 customclass loader_fetch_batch_load_data status=error, duration_ms=958596.6
(15 min 58 seconds)
And I observed this duration multiple times as well.
Hi @vbarbaresi
Thanks so much for the investigation and summary, you can disable retry policy by setting retry_total=0, and set connection_timeout or read_timeout according to your need.
eg. BlobClient(youraccounturl, credential=storage_account_key, container_name='foo', blob_name='bar', retry_total=0, connection_timeout=2)
more retry param could be found here https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/storage/azure-storage-blob/README.md#retry-policy-configuration
Also we will discuss updating header during retry! Thanks again for the feedback!
Thanks, but the problem comes from the ManagedIdentityCredential client requests. This client requests http://169.254.169.254/metadata/identity/oauth2/token, it happens before requesting blob storage.
It has his own client with his own policies, and it's initialized before the BlobClient. It's initialized right when we create the Credential object
https://github.com/Azure/azure-sdk-for-python/blob/117a6f509d5995ed2c56aca3d134b093e019b577/sdk/identity/azure-identity/azure/identity/_authn_client.py#L187-L197
So I finally found a way to configure the timeout for the IMDS requests:
credential = DefaultAzureCredential(timeout=2, read_timeout=2)
this passes the timeout values to the ManagedIdentityCredential, which passes them to the AuthnClient. We could also pass retry policies but I'm fine with the default retry policy with a reasonable timeout
Hi @vbarbaresi
thanks for digging into that! The reason that ManagedIdentityCredential takes long to timeout is the token expired and it triggered retry.
We created an issue https://github.com/Azure/azure-sdk-for-python/issues/14701
which will fix your problem
Sorry about the inconvenience
Hey! a PR has just been merged to resolve this issue (https://github.com/Azure/azure-sdk-for-python/pull/14629).
This fix will be available in our next release happening soon :)
Thanks for reporting this and for your patience!