gsutil -m -h "Cache-Control: public, max-age=31536000" cp -r test/** gs://some-bucket
Traceback (most recent call last):
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil.py", line 124, in RunMain
    sys.exit(gslib.__main__.main())
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 424, in main
    return _RunNamedCommandAndHandleExceptions(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 762, in _RunNamedCommandAndHandleExceptions
    _HandleUnknownFailure(e)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 620, in _RunNamedCommandAndHandleExceptions
    return command_runner.RunNamedCommand(command_name,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 411, in RunNamedCommand
    return_code = command_inst.RunCommand()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1201, in RunCommand
    self.Apply(_CopyFuncWrapper,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1499, in Apply
    self._ParallelApply(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1719, in _ParallelApply
    self._CreateNewConsumerPool(process_count, thread_count,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1384, in _CreateNewConsumerPool
    p.start()
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_io.TextIOWrapper' object
gsutil version: 4.47
I ran into the same issue. If you use another interpreter (Python 3.7, for instance), all is well. This is a problem specifically with Python 3.8.
Google Cloud SDK 281.0.0
beta 2019.05.17
bq 2.0.53
cloud-firestore-emulator 1.10.4
core 2020.02.14
gsutil 4.47
Bug is still there
Google Cloud SDK 283.0.0
alpha 2019.05.17
app-engine-python 1.9.88
beta 2019.05.17
bq 2.0.54
cloud-datastore-emulator 2.1.0
core 2020.02.28
gsutil 4.48
The bug still exists, but only with the multiprocessing flag; it runs fine without -m:
Google Cloud SDK 286.0.0
bq 2.0.55
core 2020.03.24
gsutil 4.48
Tracking this down, this error comes from a change in Python 3.8 in the multiprocessing library:
Changed in version 3.8: On macOS, the spawn start method is now the default. The fork start method should be considered unsafe as it can lead to crashes of the subprocess. See bpo-33725.
Spawn is used by default for anyone on macOS with Python 3.8+, since nothing is explicitly set through either get_context or set_start_method.
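For anyone who wants to see the failure outside gsutil, here is a minimal sketch. With spawn, the parent pickles the process object before handing it to the child (that is the reduction.dump(process_obj, fp) frame in the traceback), so anything reachable from it that holds an open text stream fails to serialize. With fork the child simply inherits the parent's memory and nothing is pickled. The state dict below is a hypothetical stand-in, not gsutil's actual object, and the exact wording of the message varies between Python versions.

```python
import io
import pickle


def pickle_error(obj):
    """Return the TypeError message raised when pickling obj, or None."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# Hypothetical worker state that keeps an open text stream, similar in
# spirit to a command object holding a log or terminal handle. An
# in-memory stream is used so the sketch does not touch the filesystem.
state = {"name": "worker-1", "log": io.TextIOWrapper(io.BytesIO())}

error_message = pickle_error(state)
print(error_message)
```

Dropping the stream from the state, or reopening it inside the child, is the usual way to make such an object safe to send to a spawned process.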
The issue still occurs with Cloud SDK 302.0.0 (gsutil 4.52) on macOS 10.15.6 with Python 3.8.5 installed from Homebrew.
One workaround is to use the Python 3 interpreter shipped with macOS, /usr/bin/python3, by setting the Cloud SDK interpreter path: https://cloud.google.com/sdk/gcloud/reference/topic/startup
gsutil does not work with Python 3.8; force it to use Python 3.7 with something like:
export CLOUDSDK_PYTHON=/usr/bin/python3 # on mac
export CLOUDSDK_PYTHON=/usr/bin/python3.7 # on linux
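Since the right path differs from machine to machine, it can help to check which version each candidate interpreter actually reports before exporting CLOUDSDK_PYTHON. A rough sketch, assuming only the standard library; the helper names and the (3, 8) cutoff are my own choices for illustration, not anything shipped with the Cloud SDK:

```python
import subprocess
import sys


def interpreter_version(path):
    """Return the (major, minor) version reported by the interpreter at path."""
    result = subprocess.run(
        [path, "-c", "import sys; print(sys.version_info[0], sys.version_info[1])"],
        capture_output=True,
        text=True,
        check=True,
    )
    major, minor = result.stdout.split()
    return int(major), int(minor)


def first_usable(candidates, max_exclusive=(3, 8)):
    """Return the first candidate whose version is below max_exclusive, else None."""
    for path in candidates:
        try:
            version = interpreter_version(path)
        except (OSError, subprocess.CalledProcessError, ValueError):
            # Missing binary, failed run, or unexpected output: skip it.
            continue
        if version < max_exclusive:
            return path
    return None


if __name__ == "__main__":
    # Candidate paths taken from the comments in this thread.
    choice = first_usable(["/usr/bin/python3", "/usr/bin/python3.7"])
    print(choice)
```

Whatever path it prints (if any) is the value to try for CLOUDSDK_PYTHON.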
@aleb I'm not sure if this is specific to macOS Mojave, but the path to python3 for me was /usr/local/bin/python3. I couldn't get it to work with python3 anyway, but forcing it to use 2.7 worked like a charm.
export CLOUDSDK_PYTHON=/usr/local/bin/python3 # did not work
export CLOUDSDK_PYTHON=/usr/bin/python2.7 # worked
From the link @caizixian provided,
Python 3 is preferred over Python 2. Note that gcloud requires Python version 2.7.x or 3.5 and up. Other Python tools shipped in the Cloud SDK do not support Python 3 and require Python 2.7.x,
Another workaround on macOS is to
brew install [email protected]
export CLOUDSDK_PYTHON=/usr/local/opt/[email protected]/bin/python3
@dinvlad That worked for me! Thank you so much
@dinvlad Thank you! Works perfectly!
With such a strange "pickle" error, I didn't expect to find my resolution so quickly. Thank you, @dinvlad!!
export CLOUDSDK_PYTHON=/usr/bin/python2.7 will work! Using export CLOUDSDK_PYTHON=/usr/bin/python3 or export CLOUDSDK_PYTHON=path/for/python3.7 solves the current issue but runs into a "module 'sys' has no attribute 'maxint'" error.
While I recognize comments like "Is tHiS fiXEd??" are not helpful — would it be possible for someone on the Google side to acknowledge this is a bug in gsutil and plan to resolve?
Currently, IIUC, gsutil breaks on python 3.8 — a version released a year ago, and the default brew version. Workarounds like installing another version of python are not small adjustments, and difficult for less technical colleagues. There are 49 :+1:s on the issue.
Sorry for the delay. We are aware of this bug and we are working on releasing this workaround soon https://github.com/GoogleCloudPlatform/gsutil/pull/1107
Another workaround would be to disable multiprocessing altogether when using Python 3.8. This can be done either by setting parallel_process_count=1 in the boto config file or by passing the option on the command line like this:
gsutil -o "GSUtil:parallel_process_count=1" -m cp .....
This will be relatively slow because it uses a single process; however, multithreading will still be on.
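If gsutil is called from a script, the override can be added only where it is needed. A small sketch under stated assumptions: the function names are hypothetical, and the check must be given the platform and version of the interpreter gsutil actually runs under (the defaults below describe the calling script's interpreter, which may not be the same one).

```python
import sys


def needs_single_process(platform=sys.platform, version=sys.version_info[:2]):
    """True on macOS with Python 3.8+, where spawn is the default start method."""
    return platform == "darwin" and tuple(version) >= (3, 8)


def gsutil_cp_args(source, destination, single_process):
    """Build an argv list for a recursive, multithreaded gsutil copy."""
    args = ["gsutil"]
    if single_process:
        # Top-level -o override taken from the comment above.
        args += ["-o", "GSUtil:parallel_process_count=1"]
    args += ["-m", "cp", "-r", source, destination]
    return args


if __name__ == "__main__":
    argv = gsutil_cp_args("test", "gs://some-bucket", needs_single_process())
    print(" ".join(argv))
```

The resulting list can be passed to subprocess.run as is; the script above only prints it.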
That's an excellent workaround, thanks @dilipped !
Updating gsutil solved the issue with Python 3.8.
gsutil does not work with python 3.8, force it to use python 3.7 with something like
export CLOUDSDK_PYTHON=/usr/bin/python3 # on mac
export CLOUDSDK_PYTHON=/usr/bin/python3.7 # on linux
It works fine for me.