Hello,
I recently installed Gitea on my personal server. I find it great to manage my git repositories but I was wondering if there was a way to host my DVC repositories as well. My idea is that, since Gitea allows cloning git repositories with SSH, I should be able to clone DVC repositories with SSH as well.
So I tried it with one project. I set a URL for DVC which is the same as the one for git, but changing the extension from .git to .dvc (ideally, I would like to use this convention). I also created the corresponding directory on the server. If I try to dvc push, I always get this error:
ERROR: unexpected error - EOF during negotiation
------------------------------------------------------------
Traceback (most recent call last):
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/paramiko/sftp_client.py", line 130, in __init__
server_version = self._send_version()
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/paramiko/sftp.py", line 134, in _send_version
t, data = self._read_packet()
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/paramiko/sftp.py", line 201, in _read_packet
x = self._read_all(4)
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/paramiko/sftp.py", line 188, in _read_all
raise EOFError()
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/main.py", line 49, in main
ret = cmd.run()
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/command/data_sync.py", line 49, in run
recursive=self.args.recursive,
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/repo/__init__.py", line 31, in wrapper
ret = f(repo, *args, **kwargs)
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/repo/push.py", line 25, in push
return self.cloud.push(used, jobs, remote=remote)
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/data_cloud.py", line 81, in push
show_checksums=show_checksums,
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/remote/local.py", line 385, in push
download=False,
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/remote/local.py", line 358, in _process
download=download,
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/remote/local.py", line 279, in status
md5s, jobs=jobs, name=str(remote.path_info)
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/remote/ssh/__init__.py", line 329, in cache_exists
ret = list(itertools.compress(checksums, in_remote))
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
yield fs.pop().result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/remote/ssh/__init__.py", line 322, in exists_with_progress
return self.batch_exists(chunks, callback=pbar.update_desc)
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/remote/ssh/__init__.py", line 290, in batch_exists
channels = ssh.open_max_sftp_channels()
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/dvc/remote/ssh/connection.py", line 301, in open_max_sftp_channels
self._sftp_channels.append(self._ssh.open_sftp())
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/paramiko/client.py", line 556, in open_sftp
return self._transport.open_sftp_client()
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/paramiko/transport.py", line 1097, in open_sftp_client
return SFTPClient.from_transport(self)
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/paramiko/sftp_client.py", line 170, in from_transport
return cls(chan)
File "/home/gcoter/projects/music/unsupervised-source-separation/venv/lib/python3.6/site-packages/paramiko/sftp_client.py", line 132, in __init__
raise SSHException("EOF during negotiation")
paramiko.ssh_exception.SSHException: EOF during negotiation
------------------------------------------------------------
Of course, I know that Gitea was not developed with DVC in mind, so maybe my idea is impossible. But I would like to be sure. Could you give me advice about what I should check on the Gitea server to make sure that DVC can access it through SSH?
@gcoter, you added the SSH URL of the repo that's in Gitea as a DVC remote, right?
I'm afraid that's not supported, as the SSH URL is only meant for authentication in Gitea (same with GitHub/GitLab, for that matter). There's no way to SSH into it directly (and it's meant to host git repositories).
DVC requires both SSH and SFTP access for SSH remotes to work. You can, however, use git push to push to the remote Gitea server, and add one of the remote storages supported by DVC. Then you can share files tracked by DVC via dvc push.
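One way to check whether a given SSH endpoint would be usable as a DVC SSH remote is to test whether it offers SFTP at all, which is the step that fails in the traceback above (paramiko's open_sftp ending in "EOF during negotiation"). The following is only a minimal sketch, assuming paramiko is installed, key or agent authentication is set up, and the host key is already in known_hosts. The helper names can_open_sftp and probe_sftp, and any host, port and user passed to them, are hypothetical and not part of DVC or Gitea.

```python
def can_open_sftp(client) -> bool:
    """Return True if an already-connected SSH client can start an SFTP session.

    `client` is expected to be a connected paramiko.SSHClient (or any object
    exposing a compatible open_sftp() method).
    """
    try:
        sftp = client.open_sftp()
    except Exception:
        # paramiko raises SSHException("EOF during negotiation") when the
        # server closes the channel instead of speaking SFTP.
        return False
    sftp.close()
    return True


def probe_sftp(host: str, port: int, user: str) -> bool:
    """Connect over SSH and report whether the server offers SFTP."""
    import paramiko  # third-party; imported lazily so can_open_sftp has no dependency

    client = paramiko.SSHClient()
    client.load_system_host_keys()  # the host key must already be in known_hosts
    client.connect(host, port=port, username=user)
    try:
        return can_open_sftp(client)
    finally:
        client.close()
```

Calling probe_sftp with the host, port and user of the server in question should return False for an endpoint that, like the Gitea SSH service in this thread, accepts the connection but does not offer SFTP. Connection or authentication failures are deliberately left to propagate as exceptions.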
I set a URL for DVC which is the same as the one for git, but changing the extension from .git to .dvc
You mean you changed a URL similar to [email protected]:iterative/dvc.git to [email protected]:iterative/dvc.dvc? Anyway, hosting a DVC repository is not supported by Gitea. However, as mentioned above, you can host the git repository there and add a different remote storage for DVC.
I recently installed Gitea on my personal server. I find it great to manage my git repositories but I
was wondering if there was a way to host my DVC repositories as well. My idea is that, since Gitea
allows cloning git repositories with SSH, I should be able to clone DVC repositories with SSH as well.
You can set up an SSH server on the same machine as Gitea and use it as an SSH remote storage if you prefer. :slightly_smiling_face:
Hi @skshetry, thank you for your quick response! Indeed, my idea was that, if I have a git URL like [email protected]/gcoter/project.git, I would store the corresponding DVC repository at [email protected]/gcoter/project.dvc. I thought that the SSH service running on Gitea was agnostic to what is inside the requested directory (acting like SFTP) and that I could just use more or less the same URL, as long as I store my DVC data in the same directory as the one used by Gitea.
So, if I understand correctly, SSH is only used for authentication, not as an SFTP server, right? In that case, I understand that it cannot work.
You can set up an SSH server on the same machine as Gitea and use it as an SSH remote storage if you prefer
Yes, I think I will do that. I will create an SFTP server which will be responsible for DVC repositories and store them next to the git repositories. That way, I can let Gitea work as it was intended :slightly_smiling_face:
@gcoter, just a friendly ping! :slightly_smiling_face:
Sorry for the late reply, though. Got lost.
Yes, I think I will do that. I will create an SFTP server which will be responsible for DVC repositories and store them next to the git repositories. That way, I can let Gitea work as it was intended
Did you try? Did it work?
If you need any help, please do comment or ask us on Discord.
So, if I understand correctly, SSH is only used for authentication, not as an SFTP server, right? In that case, I understand that it cannot work.
Yes, it's used by Gitea to authenticate you so that you can push to and pull from the repository. I don't know the specific details, though. Sorry.
Hi @skshetry, I haven't had time to finish it, but I have started trying several Docker images (using Docker is a requirement for me) which would allow me to create an SFTP server sharing the same volume as Gitea. I will tell you when I am done :)
I finally managed to make it work by using this docker image: https://github.com/atmoz/sftp
Here is how I did it:
1) I cloned the project and built the image locally on the Raspberry Pi (because the image on Docker Hub was not compatible with ARM). To avoid confusion, I tagged it "rpi".
2) I added this to the docker-compose file I used for Gitea:
dvc:
  image: atmoz/sftp:rpi
  ports:
    - "<A port different from the one used by Gitea for SSH>:22"
  volumes:
    - /home/gcoter/.ssh/id_ed25519:/etc/ssh/ssh_host_ed25519_key:ro
    - /home/gcoter/.ssh/id_rsa:/etc/ssh/ssh_host_rsa_key:ro
    - dvc_data:/home/dvc
  command: dvc::1001
3) I opened a bash session inside the SFTP container and created the folder /home/dvc/gcoter (to mirror the convention used in Gitea). So now, all of the added DVC repositories will be stored in the volume dvc_data, in the folder gcoter. I also added my public keys to /home/dvc/.ssh/authorized_keys with the right permissions, because I want to authenticate with keys from my personal computer (no password).
4) Because Gitea and this SFTP server are on the same host, I can use almost the same URL in my projects. For instance, ssh://git@<URL>:<Gitea SSH Port>/gcoter/project.git becomes ssh://dvc@<URL>/gcoter/project.dvc with the SFTP port.
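The URL convention from step 4 can be sketched in a few lines of Python. This is an illustration only: the function name dvc_remote_url, the host example.com and the port numbers are made up, and writing the SFTP port directly into the URL is an assumption (it could also be configured separately on the DVC remote).

```python
from urllib.parse import urlsplit


def dvc_remote_url(git_url: str, sftp_port: int, dvc_user: str = "dvc") -> str:
    """Derive the DVC remote URL that mirrors a Gitea SSH clone URL."""
    parts = urlsplit(git_url)
    if parts.scheme != "ssh" or parts.hostname is None:
        raise ValueError(f"expected an ssh:// URL, got {git_url!r}")
    path = parts.path
    # Swap the .git extension for .dvc, keeping the <user>/<project> layout.
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return f"ssh://{dvc_user}@{parts.hostname}:{sftp_port}{path}.dvc"


# Hypothetical host and ports, for illustration only.
print(dvc_remote_url("ssh://git@example.com:2222/gcoter/project.git", sftp_port=2022))
# prints: ssh://dvc@example.com:2022/gcoter/project.dvc
```

The resulting URL could then be registered in a project with dvc remote add -d <name> <url>, where <name> is whatever remote name you prefer.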
I finally managed to make it work by using this docker image: https://github.com/atmoz/sftp
@gcoter, glad to hear that. And, thanks for sharing the setup information, this will benefit a lot of users for sure (including me :stuck_out_tongue:).