When running a gsutil rsync from a cloud storage bucket to a local directory, I got the following warning:
WARNING: gsutil rsync uses hashes when modification time is not available at
both the source and destination. Your crcmod installation isn't using the
module's C extension, so checksumming will run very slowly. If this is your
first rsync since updating gsutil, this rsync can take significantly longer than
usual. For help installing the extension, please see "gsutil help crcmod".
Note that the documentation in gsutil help crcmod indicates that macOS should include this by default, quoting:
gsutil distributes a pre-compiled version of crcmod for macOS, so you shouldn't
need to compile and install it yourself. If for some reason the pre-compiled
version is not being detected, please let the Google Cloud Storage team know
So I'm filing an issue as directed.
macOS 10.15.6
Cloud SDK 303.0.0 (core libraries 2020.07.24)
Also having this issue. FWIW, manually installing crcmod fixed this.
sudo pip3 install -U crcmod
Note that I am using python3, per the documentation at https://cloud.google.com/sdk/gcloud/reference/topic/startup, by adding
export CLOUDSDK_PYTHON=python3
to my .zshrc before installing the SDK.
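For anyone wanting to confirm whether the install actually took effect, here is a minimal sketch that asks a given interpreter whether crcmod loaded its C extension. It relies on the private `_usingExtension` flag inside `crcmod.crcmod`, which is an implementation detail and could change; the function name `crcmod_extension_status` is made up for illustration. Run it with the same interpreter that CLOUDSDK_PYTHON points at, since a different python3 may have a different set of packages.

```python
import sys


def crcmod_extension_status():
    """Return True/False for whether crcmod's C extension loaded,
    or None if crcmod is not importable from this interpreter."""
    try:
        import crcmod.crcmod as crcmod_impl
    except ImportError:
        return None
    # _usingExtension is a private flag set by crcmod at import time
    # (assumption: it may be renamed or removed in other versions).
    return bool(getattr(crcmod_impl, "_usingExtension", False))


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("crcmod C extension:", crcmod_extension_status())
```

If this prints False or None for the interpreter gsutil uses, the "compiled crcmod" line in gsutil version -l would be expected to stay False as well.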
This actually hasn't worked out. I now get this error for any gsutil cp process:
Copying gs://<path>...
Traceback (most recent call last):
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
gsutil.RunMain()
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil.py", line 122, in RunMain
sys.exit(gslib.__main__.main())
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 429, in main
return _RunNamedCommandAndHandleExceptions(
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 767, in _RunNamedCommandAndHandleExceptions
_HandleUnknownFailure(e)
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 625, in _RunNamedCommandAndHandleExceptions
return command_runner.RunNamedCommand(command_name,
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 411, in RunNamedCommand
return_code = command_inst.RunCommand()
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1190, in RunCommand
self.Apply(_CopyFuncWrapper,
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1514, in Apply
self._SequentialApply(func, args_iterator, exception_handler, caller_id,
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1586, in _SequentialApply
worker_thread.PerformTask(task, self)
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 2306, in PerformTask
results = task.func(cls, task.args, thread_state=self.thread_gsutil_api)
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 778, in _CopyFuncWrapper
cls.CopyFunc(args,
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 982, in CopyFunc
_, bytes_transferred, result_url, md5 = copy_helper.PerformCopy(
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 3873, in PerformCopy
return _DownloadObjectToFile(src_url,
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 3054, in _DownloadObjectToFile
crc32c) = (_DoSlicedDownload(src_url,
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 2700, in _DoSlicedDownload
cp_results = command_obj.Apply(
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1499, in Apply
self._ParallelApply(
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1719, in _ParallelApply
self._CreateNewConsumerPool(process_count, thread_count,
File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1384, in _CreateNewConsumerPool
p.start()
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_io.TextIOWrapper' object
@brianbrownton The _io.TextIOWrapper error is a separate issue that affects macOS with Python 3.8: https://github.com/GoogleCloudPlatform/gsutil/issues/961
I am guessing that you won't see it if you use Python 3.7.
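Some background on why the Python version matters: the traceback above ends in multiprocessing's spawn path (popen_spawn_posix.py), which pickles the process object before starting the child, and Python 3.8 made spawn the default start method on macOS. The sketch below only demonstrates the underlying limitation, that an open text stream cannot be pickled. It does not reproduce gsutil's code, and the helper name `can_pickle` is hypothetical.

```python
import io
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False if pickling is refused."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


# An in-memory text stream stands in for an open file or sys.stderr.
text_stream = io.TextIOWrapper(io.BytesIO())

print("plain dict picklable: ", can_pickle({"bucket": "example"}))
print("text stream picklable:", can_pickle(text_stream))
```

Under the fork start method the child inherits such objects without pickling them, which would be consistent with the error not appearing on Python 3.7.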
I have the same problem. Added env vars pointing to python3 as well, to no avail.
gsutil version: 4.57
checksum: 43b6eb5e813ffed48ec2e541025259cb (OK)
boto version: 2.49.0
python version: 3.9.1 (default, Dec 10 2020, 11:11:14) [Clang 12.0.0 (clang-1200.0.32.27)]
OS: Darwin 20.2.0
multiprocessing available: True
using cloud sdk: True
pass cloud sdk credentials to gsutil: True
config path(s): /Users/XXX/.config/gcloud/legacy_credentials/XXX/.boto
gsutil path: /Users/XXX/Code/google-cloud-sdk/bin/gsutil
compiled crcmod: False
installed via package manager: False
editable install: False
I've run sudo pip3 install -U crcmod.
Any updates on this?
@mihar It's possible that crcmod is not being installed for the Python binary that the Cloud SDK actually uses.
You can try this:
# Get the python path that Cloud SDK is using
python_path=$(gcloud info | grep "Python Location" | sed 's/.*\[\(.*\)\]/\1/g' )
# Install crcmod for that python binary
"$python_path" -m pip install -U crcmod
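The same lookup can be sketched in Python for anyone who prefers not to rely on grep and sed. This assumes, as the sed expression above does, that gcloud info prints a line of the form "Python Location: [/path/to/python]". The function names and the sample text are hypothetical, and the pip command is only built here, not executed.

```python
import re


def parse_python_location(gcloud_info_text):
    """Extract the path inside the brackets of the 'Python Location' line,
    or return None if no such line is present."""
    match = re.search(r"Python Location:\s*\[(.*)\]", gcloud_info_text)
    return match.group(1) if match else None


def crcmod_install_command(python_path):
    """Build the pip invocation for that specific interpreter."""
    return [python_path, "-m", "pip", "install", "-U", "crcmod"]


# Made-up sample in the assumed output format, not real gcloud output.
sample = "Python Version: [3.9.1]\nPython Location: [/usr/local/bin/python3]\n"
path = parse_python_location(sample)
print(path)
print(crcmod_install_command(path))
```

In real use the text would come from something like subprocess.run(["gcloud", "info"], capture_output=True, text=True).stdout, and the resulting list could be passed to subprocess.run.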