I just tried copying a directory containing several files and directories using:
gsutil -m cp -r dir gs://my-bucket
It copied only the top-level files from dir to the bucket.
I'm using:
gsutil version: 4.27
checksum: 522455e2d24593ff3a2d3d237eefde57 (OK)
boto version: 2.47.0
python version: 2.7.6 (default, Oct 26 2016, 20:30:19) [GCC 4.8.4]
OS: Linux 4.4.0-83-generic
multiprocessing available: True
using cloud sdk: False
pass cloud sdk credentials to gsutil: False
config path(s): /usr/local/google/home/mfschwartz/.boto_prod_oauth
gsutil path: /usr/local/google/home/mfschwartz/gsutil/gsutil
compiled crcmod: True
installed via package manager: False
editable install: False
If all of those subdirectories were symlinks, or contained no regular files, then that's working as intended. But if the subdirectories were non-symlink directories with regular files in them, that seems like a bug (although I'm not sure why this would happen).
Could you provide a file tree for which this is reproducible?
After digging, I found that this happened because the top-level directory I tried to copy contained an invalid symlink. The core problem is that gsutil gives up when it encounters this condition, so the problem is actually unrelated to subdirectories; in the case I originally reported, the symlink just happened to be encountered before the first subdirectory.
I'd point out that if you create a directory on Unix containing several files plus an invalid symlink that sorts lexicographically earlier than some of the files, and then use the Unix cp command to copy them all, cp complains about the invalid symlink but finishes copying the other files:
% mkdir repro
% touch repro/{1,3,4}
% ln -s /broken repro/2
% mkdir new
% cp repro/* new
cp: cannot stat ‘repro/2’: No such file or directory
% ls new
1 3 4
I think gsutil should similarly keep going after it encounters a broken symlink, given our guiding principle of making gsutil behave as similarly as possible to its Unix command ancestors.
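Until that happens, a pre-flight sweep can at least surface the broken symlinks before uploading. This is a sketch assuming GNU find, whose -xtype l matches symlinks with missing targets; gsutil cp's -e flag (exclude symlinks) is the other option if you'd rather skip them than delete them:

```shell
# Rebuild the repro directory from above: three files plus a broken symlink.
mkdir -p repro
touch repro/1 repro/3 repro/4
ln -sf /broken repro/2

# -xtype l (GNU find) is true for symlinks whose target does not exist.
find repro -xtype l           # prints repro/2
find repro -xtype l -delete   # remove them; or pass -e to gsutil cp instead
```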
I'm having a similar issue: gsutil is not copying some of my subdirectories. Specifying -c doesn't appear to help (I'm using -m anyway, so I'm not even sure specifying -c is necessary). The subdirectories are not symlinks. cp just seems to stop when it reaches a bad file. Any workaround?
I should also mention if I point directly to one of the subdirectories to have it upload, it works. So I'm at a loss for what's going on.
Trying gsutil rsync -D reveals this stack while syncing:
DEBUG: Exception stack trace:
Traceback (most recent call last):
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 590, in _RunNamedCommandAndHandleExceptions
    user_project=user_project)
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 372, in RunNamedCommand
    return_code = command_inst.RunCommand()
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/commands/rsync.py", line 1536, in RunCommand
    diff_iterator = _DiffIterator(self, src_url, dst_url)
  File "/usr/lib/google-cloud-sdk/platform/gsutil/gslib/commands/rsync.py", line 939, in __init__
    raise CommandException('Caught non-retryable exception - aborting rsync')
CommandException: CommandException: Caught non-retryable exception - aborting rsync
CommandException: Caught non-retryable exception - aborting rsync
By specifying -U and -e I was able to work around this issue. I still do not know which file caused cp and rsync to error out, but it seems like -c should cause cp to continue despite the error. More useful feedback when this issue occurs would also help identify the file causing the problem and help a user work around it.
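In lieu of better feedback from gsutil, a small sweep can list the usual suspects (broken symlinks and unreadable regular files) before running cp or rsync. A sketch assuming GNU find; scan is a hypothetical helper name:

```shell
# Print entries under a directory that commonly abort a copy:
# broken symlinks (-xtype l) and regular files the current user
# cannot read (! -readable). Both tests are GNU find extensions.
scan() { find "$1" \( -xtype l -o -type f ! -readable \) -print; }

# Demo tree: two ordinary files and one broken symlink.
mkdir -p sample/sub
touch sample/ok sample/sub/also_ok
ln -sf /nowhere sample/bad_link
scan sample    # prints sample/bad_link
```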
Had the same issue with
gsutil cp -r gs://some loc_dir
after updating all gsutil installations to the latest version.
Confirming that the invocation
gsutil cp -r -U -e -c
solved the issue.
Me too - gsutil skips over files in directories under /src
gsutil -m cp -U -e -r /src gs://bucket/prefix/
gsutil version: 4.46
EDIT: gsutil cp / rsync will ignore any directories it does not have permission to enter without a warning or error message.
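To spot those silently skipped directories ahead of time, one approach is to list directories missing the owner's search (x) bit. This is only an approximation of "cannot enter" (group/other permissions and ACLs matter too), and it assumes GNU find's -perm syntax:

```shell
# Demo tree: one enterable directory, one with all permissions removed.
mkdir -p tree/open tree/locked
chmod 000 tree/locked

# ! -perm -u+x matches directories whose owner-execute bit is unset;
# -prune stops find from trying (and failing) to descend into them.
unreadable=$(find tree -mindepth 1 -type d ! -perm -u+x -prune -print)
echo "$unreadable"              # tree/locked

chmod 755 tree/locked           # restore so the tree can be cleaned up
```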
How do I copy empty subdirectories? gsutil is only copying subdirectories which have files; empty ones are ignored and not created in the target bucket. Please help.
@prabhat-diwaker gsutil skips empty directories, it is the intended behavior.
Does gsutil have an option which would allow you to upload empty folders or, at least, folders which contain another folder but don't contain files?
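I'm not aware of such an option; since GCS has a flat namespace, the usual workaround is to seed each empty directory with a zero-byte placeholder so gsutil cp -r has an object to create for it. A sketch assuming GNU find's -empty test; the .keep filename is just a convention:

```shell
# Demo tree: one directory with a file, one empty directory.
mkdir -p project/data project/logs
touch project/data/file.txt

# Drop a placeholder into every empty directory before uploading.
find project -type d -empty -exec touch {}/.keep \;
ls -A project/logs    # .keep
```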