gsutil rsync doesn't recognize some directories as suitable for transfer

Created on 10 Aug 2017 · 15 Comments · Source: GoogleCloudPlatform/gsutil

While trying to move directories recursively to GCS, I've run into this error on almost every directory:

CommandException: arg (./guix-backup) does not name a directory, bucket, or bucket subdir.

I'm guessing the tool detects these as files instead of directories for some reason. Here's the output from two commands, one that works and one that doesn't.

build-staging ~ # ls -lsa /var/log
total 244
  8 drwxr-xr-x.  4 root root              4096 Jun 13 08:49 .
  8 drwxr-xr-x. 11 root root              4096 Aug  2 16:48 ..
  4 -rw-------.  1 root utmp                 0 Jun 13 08:49 btmp
  4 -rw-r--r--.  1 root root                 0 Jun 13 08:49 faillog
  8 drwxr-sr-x.  4 root systemd-journal   4096 Jun 13 08:49 journal
  8 -rw-r--r--.  1 root root            146292 Aug 10 18:46 lastlog
  8 drwx------.  2 root root              4096 Jun  7 23:19 sssd
196 -rw-rw-r--.  1 root utmp            190080 Aug 10 18:46 wtmp
build-staging ~ #  gsutil rsync -r /var/log gs://guix/logs
Building synchronization state...
Starting synchronization
build-staging ~ # ls -lsa /gnu
total 2884
   8 drwxr-xr-x.    5 root root      4096 Aug  2 16:09 .
   4 drwxr-xr-x.   19 root root      4096 Jul 12 08:50 ..
   0 drwxrwxr-t. 1304 root hamann       0 Aug  9 21:44 store
 488 drwxrwxr-t. 1053 root hamann  491520 Aug  2 14:25 store-old
2384 drwxr-xr-x.   35 root root   2433024 Jul 31 18:28 store2
build-staging ~ #  gsutil rsync -r /gnu/store2 gs://guix/store-test
CommandException: arg (/gnu/store2) does not name a directory, bucket, or bucket subdir.

All 15 comments

If you open up a Python interpreter and run os.path.isdir('/gnu/store2'), does it return True or False? The only thing I can think of off the top of my head that might cause this is IsDirectory() (defined in gslib/storage_url.py) returning False.

>>> os.path.isdir('/gnu/store2')
True

Mhmm. I assume that if we follow that up the stack to copy_helper.ExpandUrlToSingleBlr, then these two lines are relevant:

if storage_url.IsFileUrl():
  return (storage_url, storage_url.IsDirectory())

(unless IsFileUrl() returns false, in which case something is borked).
...and finally to storage_url.py's IsDirectory() method. Three things can cause that to fail: we think the file is a stream, we think it's a fifo, or we don't think it's a directory. You could throw a breakpoint in (via import pdb; pdb.set_trace()) before the return statement in _FileUrl's IsDirectory() method and see if either self.IsStream() or self.IsFifo() returns True for some reason (I assume the directory check itself will be True, given the output you posted above).

  • IsStream() should only be True if you passed - in as an argument
  • IsFifo() returns True if this statement is truthy: stat.S_ISFIFO(os.stat(path).st_mode)
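For anyone who wants to run these checks without a debugger, here is a minimal sketch that gathers them in one place. It only approximates the conditions described above and is not gsutil's actual code; the function name diagnose_local_path and the 'usable_as_directory' key are made up for illustration.

```python
import os
import stat


def diagnose_local_path(path):
    """Collect the local-path checks discussed in this thread.

    Hypothetical helper: it approximates the conditions described above
    and is not copied from gsutil.
    """
    is_stream = path == '-'  # per the thread, only '-' counts as a stream
    exists = os.path.exists(path)
    is_dir = os.path.isdir(path)
    # os.stat raises if the path is missing, so guard it with `exists`.
    is_fifo = exists and stat.S_ISFIFO(os.stat(path).st_mode)
    return {
        'exists': exists,
        'is_dir': is_dir,
        'is_stream': is_stream,
        'is_fifo': is_fifo,
        # Assumed combination: a directory that is neither a stream nor a FIFO.
        'usable_as_directory': is_dir and not is_stream and not is_fifo,
    }
```

Calling diagnose_local_path('/gnu/store2') from the same environment that gsutil actually runs in should show which of the checks, if any, disagrees with what ls reports.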

IsFifo() is false here as well. So I'm confused. I'm using gsutil 4.26 by the way.

Does this also happen with a fresh installation of v4.27? If so, can you paste the output of gsutil version -l? This should display some info about your Python version, OS you're using, checksum, and gsutil path (you may want to redact personal information in any local FS paths that might be in the output).

Without knowing what OS you're using, I do find it odd that this directory doesn't show up as taking up 4KiB (I assume this is your FS's block size, since . and .. take up 4KiB), or slightly more in the edge case that it might have contained lots of files at one time (this rarely occurs -- usually even something like /var/log is only a bit above block size, say 12K, but I don't think I've ever seen a directory file take up over 2 MiB).

This is on CoreOS. I'll try running gsutil inside a Docker container to compare. I'll also try with an install of 4.27.

@jsierles Did you ever figure this out? Someone else recently notified me of this same thing happening on a CoreOS system.

Nope - haven't revisited it lately, but will try again this week.

Finally got around to hunting this down; I think I've figured it out. Looks like GCE VMs have a _nifty_ alias set up for gsutil:

$ type gsutil
gsutil is aliased to `(docker images google/cloud-sdk || docker pull google/cloud-sdk) > /dev/null;docker run -t -i --net=host -v /home/<USER>/.config:/root/.config google/cloud-sdk gsutil'

In the above invocation, gsutil will run in a docker container, meaning it won't have access to the same file system hierarchy as the host system. So, it really isn't lying when it says "arg (./guix-backup) does not name a directory, bucket, or bucket subdir." -- that directory really doesn't exist within the container :)

To suggest a workaround: On my CoreOS instance, I cloned the gsutil repo, installed Python via some instructions I found here, and ran the local copy of gsutil rather than running it in a docker container -- that worked. You may want to give that a shot.

Thanks! This is easy enough to fix by creating another alias that mounts your home directory into the container. Maybe easier and more CoreOS-friendly than installing Python directly on the host.
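A rough sketch of that idea, written as a Python wrapper rather than a shell alias: it builds a docker run command like the one in the alias quoted earlier, plus one extra -v bind mount so the host directory is visible inside the container at the same path. The function name gsutil_docker_argv and the use of the current user's ~/.config for credentials are assumptions, not something the GCE image provides.

```python
import os


def gsutil_docker_argv(gsutil_args, host_dir, image='google/cloud-sdk'):
    """Build a `docker run` argv that bind-mounts host_dir into the container.

    Hypothetical helper; the base flags mirror the alias quoted in this thread.
    """
    host_dir = os.path.abspath(host_dir)
    # Assumption: credentials live in the current user's ~/.config.
    config_dir = os.path.join(os.path.expanduser('~'), '.config')
    return [
        'docker', 'run', '-t', '-i', '--net=host',
        # Credentials mount, as in the original alias.
        '-v', config_dir + ':/root/.config',
        # Extra mount: same path on both sides so absolute arguments resolve.
        '-v', host_dir + ':' + host_dir,
        image, 'gsutil',
    ] + list(gsutil_args)
```

The resulting list can be handed to subprocess.call(). Because the container's working directory differs from the host's, relative arguments such as ./guix-backup would still need to be converted to absolute paths first.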

Sounds sane :)

A major use case for using gsutil on CoreOS would be downloading some files (onto the host file system) needed to launch a container, IIUC. Given this, it doesn't seem like gsutil should default to running within a container and not having access to the host file system being referenced in the original arguments to gsutil. I've passed along this feedback to our GCE team.

For documentation's sake:
The issue being tracked with the GCE team is https://issuetracker.google.com/issues/70082703 (although visibility is limited to Google engineers only).
