Problem while downloading files with special characters.
DVC version: 1.7.9
---------------------------------
Platform: Python 3.8.3 on Linux-5.4.0-48-generic-x86_64-with-glibc2.10
Supports: gdrive, gs, http, https, ssh
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sda5
Workspace directory: ext4 on /dev/sda5
Repo: dvc, git
Error log:
2020-09-28 15:53:35,511 ERROR: failed to import ssh://[email protected]/home/data/cana/ds30. You could also try downloading it manually, and adding it with `dvc add`. - ssh command 'md5sum /home/data/cana/ds30/cana-mucuna/class35_e2545053-f2c5-4108-9042-67244a94e267_p_['cana']_o_['cana', 'mucuna'].jpg' finished with non-zero return code 1': md5sum: '/home/data/cana/ds30/cana-mucuna/class35_e2545053-f2c5-4108-9042-67244a94e267_p_[cana]_o_[cana,': No such file or directory
md5sum: mucuna].jpg: No such file or directory
------------------------------------------------------------
Traceback (most recent call last):
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/command/imp_url.py", line 14, in run
self.repo.imp_url(
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/repo/__init__.py", line 51, in wrapper
return f(repo, *args, **kwargs)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/repo/scm_context.py", line 4, in run
result = method(repo, *args, **kw)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/repo/imp_url.py", line 54, in imp_url
stage.run()
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/funcy/decorators.py", line 39, in wrapper
return deco(call, *dargs, **dkwargs)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/stage/decorators.py", line 36, in rwlocked
return call()
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/funcy/decorators.py", line 60, in __call__
return self._func(*self._args, **self._kwargs)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/stage/__init__.py", line 429, in run
sync_import(self, dry, force)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/stage/imports.py", line 29, in sync_import
stage.save_deps()
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/stage/__init__.py", line 392, in save_deps
dep.save()
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/output/base.py", line 268, in save
self.hash_info = self.get_hash()
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/output/base.py", line 178, in get_hash
return self.tree.get_hash(self.path_info)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/tree/base.py", line 263, in get_hash
hash_info = self.get_dir_hash(path_info, **kwargs)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/tree/base.py", line 330, in get_dir_hash
dir_info = self._collect_dir(path_info, **kwargs)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/tree/base.py", line 310, in _collect_dir
new_hashes = self._calculate_hashes(not_in_state)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/tree/base.py", line 296, in _calculate_hashes
return dict(zip(file_infos, hashes))
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/tree/base.py", line 295, in <genexpr>
hashes = (hi.value for hi in executor.map(worker, file_infos))
File "/home/gabriel/anaconda3/lib/python3.8/concurrent/futures/_base.py", line 611, in result_iterator
yield fs.pop().result()
File "/home/gabriel/anaconda3/lib/python3.8/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/home/gabriel/anaconda3/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
File "/home/gabriel/anaconda3/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/progress.py", line 126, in wrapped
res = fn(*args, **kwargs)
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/tree/ssh/__init__.py", line 242, in get_file_hash
return HashInfo(self.PARAM_CHECKSUM, ssh.md5(path_info.path))
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/tree/ssh/connection.py", line 295, in md5
md5 = self.execute("md5sum " + path).split()[0]
File "/home/gabriel/anaconda3/lib/python3.8/site-packages/dvc/tree/ssh/connection.py", line 276, in execute
raise RemoteCmdError("ssh", cmd, ret, err)
dvc.tree.base.RemoteCmdError: ssh command 'md5sum /home/data/cana/ds30/cana-mucuna/class35_e2545053-f2c5-4108-9042-67244a94e267_p_['cana']_o_['cana', 'mucuna'].jpg' finished with non-zero return code 1': md5sum: '/home/data/cana/ds30/cana-mucuna/class35_e2545053-f2c5-4108-9042-67244a94e267_p_[cana]_o_[cana,': No such file or directory
md5sum: mucuna].jpg: No such file or directory
------------------------------------------------------------
2020-09-28 15:53:35,520 DEBUG: Analytics is enabled.
2020-09-28 15:53:35,605 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmp4x4p60hi']'
2020-09-28 15:53:35,608 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmp4x4p60hi']'
Probable cause: the path to the file is /home/data/cana/ds30/cana-mucuna/class35_e2545053-f2c5-4108-9042-67244a94e267_p_['cana']_o_['cana', 'mucuna'].jpg (it includes combinations of special characters like `[`, `'`, `]`, `,` and space), which the file system supports via the terminal as well as `ssh` and `scp`, but `paramiko` doesn't support it. See https://github.com/paramiko/paramiko/issues/583
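For illustration, the argument split seen in the error log can be reproduced locally. This is a minimal sketch, assuming Python's `shlex.split` as a stand-in for the remote shell's word splitting and quote removal (it does not model globbing or other expansions); it is not DVC or paramiko code.

```python
import shlex

# Path copied from the error log above.
path = (
    "/home/data/cana/ds30/cana-mucuna/"
    "class35_e2545053-f2c5-4108-9042-67244a94e267"
    "_p_['cana']_o_['cana', 'mucuna'].jpg"
)

# Same naive concatenation as in the traceback: "md5sum " + path
command = "md5sum " + path

# Approximate how a POSIX shell tokenizes the unquoted command line:
# single quotes are consumed and the space splits the path in two.
argv = shlex.split(command)
for arg in argv:
    print(repr(arg))
```

Under this approximation `md5sum` receives two file arguments, one ending in `_p_[cana]_o_[cana,` and the other `mucuna].jpg`, which matches the two "No such file or directory" messages in the log.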
@jorgeorpinel looks like paramiko/paramiko#583 is about `exec_command`. Is it still relevant in this case?
(I'm asking mostly to see whether we need to create a ticket on the paramiko side in advance, if we are sure this is paramiko's issue; it takes time to resolve them.)
It's a good question. I'm not sure if that issue is exactly appropriate. I guess we could open a new one and see what the maintainers say. I commented on that one for now: https://github.com/paramiko/paramiko/issues/583#issuecomment-700967071
Yep, thanks @jorgeorpinel. It does look related. I'm still curious, though, what the right solution is: expect Paramiko to escape things, or expect Paramiko to be a thin layer that doesn't alter what you pass into it, in which case it's our responsibility to escape the command, path, etc.
@jorgeorpinel That one is not related to this issue. We simply didn't escape the command ourselves; paramiko shouldn't take care of that for us, as commands are crafted by us. This will be fixed by https://github.com/iterative/dvc/pull/4767
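As a rough sketch of what escaping the command on the caller's side could look like, assuming Python's standard `shlex.quote` is used (the actual change in the linked PR may differ, and `build_md5_command` is a hypothetical helper name, not a DVC function):

```python
import shlex


def build_md5_command(path: str) -> str:
    """Hypothetical helper: build an md5sum command line with the path quoted."""
    # shlex.quote wraps the path in single quotes and escapes any embedded
    # single quotes, so the shell passes it to md5sum as a single argument.
    return "md5sum " + shlex.quote(path)


# Path copied from the error log above.
path = (
    "/home/data/cana/ds30/cana-mucuna/"
    "class35_e2545053-f2c5-4108-9042-67244a94e267"
    "_p_['cana']_o_['cana', 'mucuna'].jpg"
)

command = build_md5_command(path)

# Round-trip check: tokenizing the quoted command yields the original path intact.
roundtrip = shlex.split(command)
print(roundtrip == ["md5sum", path])
```

With the path quoted, the brackets, single quotes, comma and space reach `md5sum` unchanged instead of being interpreted by the remote shell.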
> paramiko shouldn't take care of that for us, as commands are crafted by us
But sometimes there are limitations in our dependencies that are too difficult to address, e.g. https://github.com/iterative/dvc/issues/4392#issuecomment-674448191, which is fine, I think.
Anyway, glad this is resolved ✌️