Hi
I get a permission error when trying to push to an ssh remote.
I assume this is because I don't have full permissions on the remote (only from /cluster/data/user onwards).
As far as I can see dvc tries to create directories from the root.
https://github.com/iterative/dvc/blob/be012e70cf9b16ec73b0fbeb0f2308558c2a10dd/dvc/remote/ssh/connection.py#L96-L110
Is this intended, or am I getting something wrong?
I tried using password and keyfile.
Config:
['remote "cluster"']
url = ssh://user@host/cluster/data/user
user = user
keyfile = path-to-keyfile
Info:
dvc version 0.60.0
macos
pip installed
Hi @ynop !
Please post the full -v log; it makes it much easier for us to debug this. Also, are you able to ssh into your machine as user and create the /cluster/data/user/test directory with mkdir /cluster/data/user/test?
Yes creating manually works.
ERROR: failed to upload '.dvc/cache/70/889565c1fed05122fd06ad9492c9d3' to 'ssh://[email protected]/cluster/data/user/70/889565c1fed05122fd06ad9492c9d3' - [Errno 13] Permission denied
------------------------------------------------------------
Traceback (most recent call last):
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/base.py", line 522, in upload
no_progress_bar=no_progress_bar,
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/__init__.py", line 244, in _upload
no_progress_bar=no_progress_bar,
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 189, in upload
self.makedirs(posixpath.dirname(dest))
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
self.makedirs(head)
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
self.makedirs(head)
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
self.makedirs(head)
[Previous line repeated 2 more times]
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 104, in makedirs
self.sftp.mkdir(path)
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/paramiko/sftp_client.py", line 460, in mkdir
self._request(CMD_MKDIR, path, attr)
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/paramiko/sftp_client.py", line 813, in _request
return self._read_response(num)
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/paramiko/sftp_client.py", line 865, in _read_response
self._convert_status(msg)
File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/paramiko/sftp_client.py", line 896, in _convert_status
raise IOError(errno.EACCES, text)
PermissionError: [Errno 13] Permission denied
------------------------------------------------------------
This error appears multiple times.
I can post the full log later.
@ynop Btw, could you try downgrading to 0.59.2 to see if that one is affected too? We did add a few changes to those lines in 0.60.0, so it might be the cause.
I have tried 0.59.2, but it ends up with the same errors.
Here is the log (using 0.60.0).
https://gist.github.com/ynop/a8273bf13b76f4b0ff4d1225197c96c6
@ynop , are you using the same user in your test and with DVC?
@ynop , I'll try to replicate it
@ynop , I wasn't able to reproduce it:
sudo systemctl start sshd
sudo mkdir -p /cluster/data/${USER}
sudo chown ${USER} -R /cluster/data/${USER}
ls -lah /cluster/data/
#
# Permissions Size User Group Date Modified Name
# drwxr-xr-x - mroutis root 25 Sep 19:54 mroutis
ssh localhost mkdir /cluster/data/${USER}/test
ls -lah /cluster/data/${USER}
#
# Permissions Size User Group Date Modified Name
# drwxr-xr-x - mroutis mroutis 25 Sep 20:05 test
dvc init --no-scm
dvc remote add ssh ssh://${USER}@localhost/cluster/data/${USER}
echo "foo" > foo
dvc add foo
dvc push -r ssh
ls -R -lah /cluster/data/mroutis
# Permissions Size User Group Date Modified Name
# drwxr-xr-x - mroutis mroutis 25 Sep 21:11 d3
# drwxr-xr-x - mroutis mroutis 25 Sep 20:05 test
#
# /cluster/data/mroutis/d3:
# Permissions Size User Group Date Modified Name
# .rw-r--r-- 4 mroutis mroutis 25 Sep 21:11 b07384d113edec49eaa6238ad5ff00
dvc version
#
# DVC version: 0.60.0
# Python version: 3.7.4
# Platform: Linux-5.3.1-arch1-1-ARCH-x86_64-with-arch
# Binary: False
# Cache: reflink - True, hardlink - True, symlink - True
# Filesystem type (cache directory): ('xfs', '/dev/mapper/vg-root')
# Filesystem type (workspace): ('xfs', '/dev/mapper/vg-root')
Looking at your log, it looks like it is trying to create /cluster/data/user. dest should be /cluster/data/user/5a/9ab1413269d550586b4586714e3ffc, so after 3 head, tail = posixpath.split(path) operations, the head is /cluster/data/user.
...
self.makedirs(posixpath.dirname(dest))
File "/Users/useraw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
self.makedirs(head)
File "/Users/useraw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
self.makedirs(head)
File "/Users/useraw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
self.makedirs(head)
...
Can you make sure the /cluster/data/user directory exists?
As far as I can see dvc tries to create directories from the root.
@ynop , it is not from the root, it is from the first existing directory:
https://github.com/iterative/dvc/blob/be012e70cf9b16ec73b0fbeb0f2308558c2a10dd/dvc/remote/ssh/connection.py#L85-L90
import posixpath
head, tail = posixpath.split('/cluster/data/user/5a/9ab1413269d550586b4586714e3ffc')
print(head) # '/cluster/data/user/5a'
print(tail) # '9ab1413269d550586b4586714e3ffc'
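To make "from the first existing directory" concrete, here is a minimal sketch of that walk. It is a simplified model, not DVC's actual makedirs: plan_makedirs and the is_dir callback are hypothetical names, and is_dir stands in for the remote stat check.

```python
import posixpath


def plan_makedirs(path, is_dir):
    """Return the directories a recursive makedirs would try to create, in order.

    `is_dir` is a hypothetical callback that stands in for a remote stat check.
    """
    if is_dir(path):
        # An existing directory ends the upward walk.
        return []

    head, tail = posixpath.split(path)
    to_create = []
    if head and head != path:
        # Parents come first, so they are created before their children.
        to_create = plan_makedirs(head, is_dir)
    if tail:
        to_create.append(path)
    return to_create


# Case 1: every parent is reported as a directory, so only the leaf is created.
visible = {"/", "/cluster", "/cluster/data", "/cluster/data/user"}
print(plan_makedirs("/cluster/data/user/5a", lambda p: p in visible))

# Case 2: only "/" is reported as a directory, so the walk goes all the way up
# and the first mkdir attempt is "/cluster".
print(plan_makedirs("/cluster/data/user/5a", lambda p: p == "/"))
```

In the second case the first directory to be created is /cluster, which is where a user without permissions outside their own directory would hit a permission error.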
If I add log output like this:
if stat.S_ISDIR(st_mode):
    logger.debug('Directory')
    return
else:
    logger.debug('Not a directory')
Then I get output like this:
DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data
DEBUG: Not a directory
DEBUG: /cluster/data
DEBUG: Not a directory
DEBUG: /cluster/data
DEBUG: Not a directory
DEBUG: /cluster/data
DEBUG: Not a directory
DEBUG: /cluster
DEBUG: Not a directory
DEBUG: /cluster
DEBUG: Not a directory
DEBUG: /cluster
DEBUG: Not a directory
DEBUG: /cluster
DEBUG: Not a directory
DEBUG: /
DEBUG: Directory
DEBUG: /
DEBUG: Directory
DEBUG: /
DEBUG: Directory
DEBUG: /
DEBUG: Directory
And the directory /cluster/data/user does exist.
Although /cluster is mounted from a ceph storage.
Could that be the problem?
I logged the file mode that I get:
logger.debug(stat.filemode(st_mode))
DEBUG: /cluster
DEBUG: ?---------
Seems that it can't retrieve the file mode for the mounted directories.
@ynop Indeed, that is the cause. Very interesting. Does running stat /cluster on that machine show anything interesting?
File: /cluster
Size: 5 Blocks: 0 IO Block: 65536 directory
Device: 0h/0d Inode: 1099511661972 Links: 1
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2019-09-26 08:37:11.922091000 +0200
Modify: 2019-04-25 06:21:22.901076000 +0200
Change: 2019-04-25 06:21:22.901076000 +0200
Birth: -
@ynop The blocks value is a bit weird to see as 0, but at least stat is able to tell that it is a directory. Very interesting. Maybe sftp's implementation of stat is a bit different. Or paramiko's, even. Let me look into those.
@ynop Btw, could you please show the full st_mode as seen in dvc itself?
@ynop Btw, one more workaround that comes to mind is to create a symlink such as ~/dvc-remote that points to /cluster/data/user and use that as the remote, to see if that helps by any chance. Could you give it a try, please?
When I log st_mode it just outputs 0.
The workaround is not really possible, since all places I have access to are on the mounted volumes.
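A minimal sketch of why an st_mode of 0 fails a directory check: with no file-type bits set, stat.S_ISDIR returns False. Here describe_mode is a hypothetical helper, and 0o40755 is just an example of an ordinary directory mode.

```python
import stat


def describe_mode(mode):
    """Return (symbolic mode string, is-directory flag) for a raw st_mode."""
    return stat.filemode(mode), stat.S_ISDIR(mode)


# 0 carries no file-type bits; 0o40755 is a directory with rwxr-xr-x permissions.
for mode in (0, 0o40755):
    print(mode, *describe_mode(mode))
```

So a zero mode does not necessarily mean the server returned odd metadata; it may simply be a default value used when the stat call did not succeed.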
@ynop Btw, what does stat -L /cluster show? We are using lstat in dvc itself, and I wonder if it is causing that.
Exactly the same output.
@ynop Did I get you right that you were either using pdb or modifying the dvc code in place and running it? If so, would you mind showing the output of self.sftp.stat(path).st_mode?
@ynop I've taken a look at paramiko's and openssh's stat functions, and they both seem pretty normal, just passing st_mode intact from the original stat() call on the server. I wonder if the CLI utility stat is doing any tricks with such mounts. Could you check the output of Python's stat().st_mode on the server for /cluster, please?
On the server via python it seems to work just fine.
>>> os.lstat('/cluster').st_mode
16877
>>> os.stat('/cluster').st_mode
16877
>>> stat.S_ISDIR(os.stat('/cluster').st_mode)
True
>>> stat.S_ISDIR(os.lstat('/cluster').st_mode)
True
When printing self.sftp.stat(path).st_mode, I don't get any output.
with ignore_file_not_found():
    logger.debug(self.sftp.stat(path).st_mode)
@ynop Maybe it is because it is raising an exception there. How about self.sftp.stat("/cluster").st_mode?
Then I get FileNotFoundError: [Errno 2] No such file and therefore no output either.
@ynop That is weird. I think there is a miscommunication here. Btw, how about we get on a video call together, so I can take a closer look at it and we don't spend so much time going back and forth in the comments for this issue? :slightly_smiling_face:
Unfortunately that's not possible today, as I am in a classroom.
@ynop Sure, we can do it tomorrow or whenever you'll have time, if we don't figure this out remotely here. :slightly_smiling_face:
Ok, so
Then I get FileNotFoundError: [Errno 2] No such file and therefore no output either.
you ran it in makedirs when it failed, right? Or in some other way? Are you modifying the dvc package in-place? Maybe consider adding import pdb; pdb.set_trace() and using the -j1 flag with dvc push, so you can enter an interactive pdb shell and experiment there.
I'm modifying the code in-place in the venv.
I run the command within the makedirs method.
I tried with pdb; then I get the FileNotFoundError when executing the step other_st_mode = self.sftp.stat(path).st_mode.
@ynop
I tried with pdb, then I get the FileNotFoundError, when executing the step other_st_mode = self.sftp.stat(path).st_mode.
But path might be pointing to something that actually doesn't exist; that is why I was asking about using /cluster as path there.
Ah sorry, overlooked that.
But I get the same results with /cluster as path.
@ynop FileNotFoundError when running that command with /cluster? O_o That is extremely odd. Are you sure you are connecting to the correct machine? If not, that would explain every single error we've seen in this thread.
Yeah, I also checked that by copying the url from the log and scp'd something there.
Hmm, I just found a solution, although I don't know why it is like that.
I debugged the available paths using self.sftp.listdir(...), starting from /.
There I see that I have the path /data/user, but on the server, or with scp or whatever, it is /cluster/data/user.
When I change the path in the dvc config, it works.
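The mismatch above can be probed with a small heuristic, sketched here under stated assumptions: find_sftp_path and candidate_paths are hypothetical helpers, the only method used on the sftp object is stat (which this thread shows raising FileNotFoundError for missing paths), and FakeSFTP is a stand-in so the example runs without a server. Dropping leading path components can also match an unrelated directory with the same name, so treat the result as a hint only.

```python
def candidate_paths(server_path):
    """Yield server_path, then variants with leading components dropped."""
    parts = [part for part in server_path.split("/") if part]
    for start in range(len(parts)):
        yield "/" + "/".join(parts[start:])


def find_sftp_path(sftp, server_path):
    """Return the first candidate that the sftp session can stat, else None."""
    for candidate in candidate_paths(server_path):
        try:
            sftp.stat(candidate)
        except FileNotFoundError:
            continue
        return candidate
    return None


class FakeSFTP:
    """Stand-in for an sftp client whose root is /cluster on the server."""

    def __init__(self, existing):
        self.existing = set(existing)

    def stat(self, path):
        if path not in self.existing:
            raise FileNotFoundError(path)
        return object()


sftp = FakeSFTP({"/", "/data", "/data/user"})
print(find_sftp_path(sftp, "/cluster/data/user"))
```

With a real connection, the same function could be handed the session's sftp client instead of FakeSFTP.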
@ynop Oh, that makes total sense! So the sftp server on that machine is configured with its root at /cluster and not the usual /.
@ynop I imagine you don't have access to /etc/ssh/sshd_config on the server, right? But even if you don't, this explains everything perfectly.
Yeah, seems that way. Thanks for the help and sorry for the circumstances.
@ynop Nothing to be sorry for. This was extremely useful and will help debugging similar issues in the future! :slightly_smiling_face: Thank you!
@ynop So in the end, this was an sftp server configuration caveat that dvc can't really do much to mitigate. But we can and should improve the error message to something like Unable to create directory '{}', so it is friendlier. Let's keep this issue open until that one is fixed.
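A minimal sketch of what such a message could look like, assuming the mkdir call raises an IOError as in the traceback earlier in this thread. RemoteDirectoryError and mkdir_with_context are hypothetical names, not DVC's API, and FakeSFTP only simulates the permission failure.

```python
import errno


class RemoteDirectoryError(Exception):
    """Hypothetical error type that names the directory that could not be created."""


def mkdir_with_context(sftp, path):
    """Call sftp.mkdir(path) and re-raise failures with the path in the message."""
    try:
        sftp.mkdir(path)
    except IOError as exc:
        raise RemoteDirectoryError(
            "Unable to create directory '{}': {}".format(path, exc)
        ) from exc


class FakeSFTP:
    """Stand-in client whose mkdir always fails with EACCES."""

    def mkdir(self, path):
        raise PermissionError(errno.EACCES, "Permission denied")


try:
    mkdir_with_context(FakeSFTP(), "/cluster")
except RemoteDirectoryError as err:
    print(err)
```

Including the path would have pointed at /cluster right away in this case, instead of a bare permission error.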
@efiop , @ynop, great discussion! We encountered that problem before :sweat_smile:
@efiop , I was thinking that maybe we can do sftp.getcwd() and then compare it to the path specified in the URL.
Also, adding the error message to makedirs will make it easier to debug :+1:
@mroutis Not sure what you want to compare it with; sftp.getcwd() will return / in /cluster because it is configured to have the root there.
@efiop , true, I forgot that chrooting also changes the PWD and friends to reflect the new _jailness_.