import tempfile
from google.cloud import storage
client = storage.Client()
bucket = client.get_bucket('my-bucket')
blob = bucket.blob('test_ascii')
fd = tempfile.TemporaryFile('w+')
fd.write('\u0090')
fd.seek(0)
blob.upload_from_file(fd)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/google/cloud/storage/blob.py", line 1085, in upload_from_file
client, file_obj, content_type, size, num_retries, predefined_acl
File "/usr/local/lib/python3.6/dist-packages/google/cloud/storage/blob.py", line 995, in _do_upload
client, stream, content_type, size, num_retries, predefined_acl
File "/usr/local/lib/python3.6/dist-packages/google/cloud/storage/blob.py", line 942, in _do_resumable_upload
response = upload.transmit_next_chunk(transport)
File "/usr/local/lib/python3.6/dist-packages/google/resumable_media/requests/upload.py", line 396, in transmit_next_chunk
self._process_response(result, len(payload))
File "/usr/local/lib/python3.6/dist-packages/google/resumable_media/_upload.py", line 574, in _process_response
self._get_status_code, callback=self._make_invalid)
File "/usr/local/lib/python3.6/dist-packages/google/resumable_media/_helpers.py", line 93, in require_status_code
status_code, u'Expected one of', *status_codes)
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 400, 'Expected one of', <HTTPStatus.OK: 200>, 308)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/dist-packages/google/cloud/storage/blob.py", line 1089, in upload_from_file
_raise_from_invalid_response(exc)
File "/usr/local/lib/python3.6/dist-packages/google/cloud/storage/blob.py", line 1960, in _raise_from_invalid_response
raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.BadRequest: 400 PUT https://www.googleapis.com/upload/storage/v1/b/my-bucket/o?uploadType=resumable&upload_id=AEnB2Uo0YkAqrWxqv4zVpm7bsO1mbUCGNIjxQPrQa4OV5HPad6kQatXYUF0UWVc8rWTMGEoYRIKH-QBUGmd35-u6FLRw04c4-A: ('Request failed with status code', 400, 'Expected one of', <HTTPStatus.OK: 200>, 308)
When I print some extra information from the custom exception, I get the following:
b'Invalid request. There were 3 byte(s) in the request body. There should have been 6 byte(s) (starting at offset 0 and ending at offset 5) according to the Content-Range header.'
When I open the file in binary mode and encode the string before writing it, the upload works, but I thought that was not necessary on Linux?
Thanks!
@Alexis-Jacob The short answer to your question is that Blob objects always want bytes: Blob.upload_from_string does, as a convenience, encode text values to UTF-8, but Blob.upload_from_file doesn't have any way to check the mode of an already-opened file. So, either open your file in binary mode and write bytes to it, or else use NamedTemporaryFile and read from the name, e.g.:
import tempfile
from google.cloud import storage
client = storage.Client()
bucket = client.get_bucket('my-bucket')
blob = bucket.blob('test_ascii')
with tempfile.NamedTemporaryFile('w+') as fd:
    fd.write('\u0090')
    fd.flush()
    blob.upload_from_filename(fd.name)
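The other option mentioned above (binary mode plus explicit encoding) could look roughly like the sketch below. Note that upload_text is a hypothetical helper name, not part of google-cloud-storage; it only assumes that the blob argument has the upload_from_file method used earlier in this thread. The UTF-8 default is an assumption chosen to mirror what upload_from_string is described as doing.

```python
import tempfile


def upload_text(blob, text, encoding="utf-8"):
    """Hypothetical helper: write text as bytes to a binary temp file, then upload it."""
    # Encode explicitly so the file only ever contains bytes.
    payload = text.encode(encoding)
    # 'w+b' opens the temporary file in binary mode, so reads return bytes.
    with tempfile.TemporaryFile("w+b") as fd:
        fd.write(payload)
        fd.seek(0)  # rewind so the upload starts at the first byte
        blob.upload_from_file(fd)
    # Return the number of bytes handed to the upload, for logging or checks.
    return len(payload)
```

With the bucket from the example above, this would be called as upload_text(bucket.blob('test_ascii'), '\u0090').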
@tseaver maybe the documentation could be updated to explain this requirement?
Also, maybe Blob.upload_from_file could check whether 'b' is in file_obj.mode? I believe an early error or warning would be better than an unexpected error later.
The scenario does work with more common non-ASCII code points such as U+00E9 (é) (UTF-8: C3 A9), which delays the moment when the bad code is discovered in production.
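For illustration only, a check along the lines suggested here might look like the sketch below. The names looks_like_text_stream and require_binary are hypothetical and not part of google-cloud-storage, and nothing in this thread confirms the library would adopt such a check. It is a best-effort heuristic: file-like objects without a string mode attribute (io.BytesIO, for example) are assumed to be binary.

```python
import io


def looks_like_text_stream(file_obj):
    """Best-effort guess at whether file_obj was opened in text mode."""
    # Text-mode files from open() and io.StringIO derive from io.TextIOBase.
    if isinstance(file_obj, io.TextIOBase):
        return True
    # Fall back to the mode string; some file-like objects have no 'mode'
    # attribute, or a non-string one, and are treated as binary here.
    mode = getattr(file_obj, "mode", None)
    return isinstance(mode, str) and "b" not in mode


def require_binary(file_obj):
    """Hypothetical guard: fail early instead of failing mid-upload."""
    if looks_like_text_stream(file_obj):
        raise ValueError("expected a file opened in binary mode")
    return file_obj
```

A caller could run require_binary(fd) before passing fd to blob.upload_from_file, so that a text-mode file is rejected before any request is sent.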