Could you give more context or info about how to reproduce the bug, @efiop?
The context is that it seems like ssh remote with url and port configured raises an error on 0.19.7:


and that error reportedly doesn't raise on 0.18.7. This bug was reported by a user today, but I am not able to reproduce it yet.
mhm, I looked at remote/ssh.py and there's no usage of SSHConfig; maybe the user was expecting some configuration (like keys) to be pulled from the .ssh/config file.
maybe with something like:
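import os
import paramiko

config = paramiko.SSHConfig()
# open() does not expand "~", so expand it explicitly
with open(os.path.expanduser('~/.ssh/config')) as ssh_config_file:
    config.parse(ssh_config_file)
config.lookup('hostname')  # dict of options for that host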
However, I don't know if this is intentional, but it would be a cool feature; also, taking into account that you commit the .dvc/config file, I don't think it is a good idea to put an SSH password there.
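To illustrate how the result of the lookup above could feed a connection, here is a minimal sketch. The helper name connect_kwargs_from_lookup is hypothetical and not part of dvc or paramiko; it only assumes that SSHConfig.lookup() returns a dict with lowercase keys, a string port, and a list under identityfile.

```python
import os


def connect_kwargs_from_lookup(host, options):
    """Translate an SSHConfig.lookup()-style dict into SSHClient.connect() kwargs.

    `options` is a plain dict here, so this helper runs without paramiko.
    """
    # Fall back to the alias itself when the config has no HostName entry.
    kwargs = {"hostname": options.get("hostname", host)}
    if "port" in options:
        kwargs["port"] = int(options["port"])  # config values are strings
    if "user" in options:
        kwargs["username"] = options["user"]
    if "identityfile" in options:
        # identityfile is a list; expand "~" in case it was left as-is
        kwargs["key_filename"] = [
            os.path.expanduser(path) for path in options["identityfile"]
        ]
    return kwargs
```

With paramiko installed, the result could be passed on as client.connect(**connect_kwargs_from_lookup(alias, config.lookup(alias))), where client is a paramiko.SSHClient.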
The main point here is that the problem seemingly disappeared when user downgraded to 0.18.7, which is strange, since there were no changes affecting his simple url + port configuration since then. So I don't think that it is related to .ssh/config, since it would've affected 0.18.7 as well.
so, taking into account that you commit the .dvc/config file, I don't think it is a good idea to put a SSH password there.
We also have .dvc/config.local which is not tracked by git and stays private, so it is suitable for storing ssh keys if user wants to do it. We also have an ask_password option in the config, that will make dvc ask for a password before connecting to the remote.
Btw, great point about the .ssh/config! I was convinced that AutoPolicy accounts for it, but looking at it right now it seems like it doesn't :( Still doesn't explain why downgrading to 0.18.7 worked :( I guess I'll try to get more details from the user shortly.
OK, I asked the user once again and it turns out that 0.18.7 no longer works either :) Btw, there is no .ssh/config this time, so we are pretty safe there :) One suspicious thing is that id_rsa starts with -----BEGIN OPENSSH PRIVATE KEY-----, which makes me think that paramiko might not be able to parse it properly. Looking into it right now.
@efiop, :thinking: , check out these lines, paramiko seems to expect only RSA or DSA tags:
Just to be sure it's not the key, is there any way the user can run this command to check if the key is valid?
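A quick way to see which header a key file carries, regardless of its file name, is to read its first non-empty line. The sketch below is an illustration only: detect_key_format and the labels in KNOWN_HEADERS are names made up for this example. Note that the OPENSSH PRIVATE KEY header names the container format and does not by itself say whether the key is RSA, ed25519 or something else.

```python
import re

# Label inside "-----BEGIN <label>-----" mapped to a short description.
KNOWN_HEADERS = {
    "RSA PRIVATE KEY": "PEM RSA",
    "DSA PRIVATE KEY": "PEM DSA",
    "EC PRIVATE KEY": "PEM ECDSA",
    "OPENSSH PRIVATE KEY": "new OpenSSH format",
    "PRIVATE KEY": "PKCS#8",
}

HEADER_RE = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----$")


def detect_key_format(key_text):
    """Describe a private key's format based on its first non-empty line."""
    for line in key_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip leading blank lines
        match = HEADER_RE.match(line)
        if match is None:
            return "not a PEM-style key file"
        label = match.group(1)
        return KNOWN_HEADERS.get(label, "unknown: " + label)
    return "empty file"
```

For a real key this could be called as detect_key_format(open(path).read()), with path pointing at the private key in question.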
# In this case, `id_rsa` is the name of the private key
ssh-keygen -l -f ~/.ssh/id_rsa
basically, it will try to print the fingerprint, and if it's not valid, it will output something like:
~/.ssh/id_rsa is not a key file.
(Here's the read function, expecting a tag)
https://github.com/paramiko/paramiko/blob/53095107625a1303bd9fcfcc7c2c20b9819ee79f/paramiko/pkey.py#L282-L338
Is the user running on a non-Unix-like system?
https://github.com/paramiko/paramiko/blob/53095107625a1303bd9fcfcc7c2c20b9819ee79f/sites/www/faq.rst#paramiko-doesnt-work-with-my-cisco-windows-or-other-non-unix-system
@mroutis Great point! I actually have some updates on this issue. Turns out the key is not an rsa key, but placed in the id_rsa file for some reason. I tried mixing that and dvc was still connecting without any problem. User re-generated keys on some older machine(they are using ubuntu 16.04 and debian 9) and got a proper RSA key, which made everything work again. Regardless, this issue makes me wonder if we should stop using paramiko in favor of ssh CLI command wrapper or rsync CLI command wrapper. The main reason for doing that is that we would at least be shielded from paramiko bugs and ssh ones would be easily reproducible by just asking user to run ssh example.com and seeing if that works.
On the other hand, updating paramiko (or fabric) is easier than updating usually system-wide commands such as ssh or rsync, which might be a problem on some legacy systems. That being said, we are a cutting edge project, so that should not be a problem. Will look into it shortly.
Can we reproduce it after all? I would first ask paramiko to fix it. I would try to avoid depending on the system environment (the ssh command) that is outside DVC; it might be even less predictable in the end.
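For illustration, a thin wrapper around the ssh CLI could look roughly like the sketch below. It is not how dvc is implemented; build_ssh_command and ssh_works are hypothetical names, and the only ssh options used are -p, -i and -o BatchMode=yes.

```python
import subprocess


def build_ssh_command(host, port=None, user=None, keyfile=None, remote_cmd="true"):
    """Build an argv list for a non-interactive ssh connectivity check."""
    # BatchMode makes ssh fail instead of prompting for a password.
    cmd = ["ssh", "-o", "BatchMode=yes"]
    if port is not None:
        cmd += ["-p", str(port)]
    if keyfile is not None:
        cmd += ["-i", keyfile]
    target = f"{user}@{host}" if user else host
    return cmd + [target, remote_cmd]


def ssh_works(host, timeout=15, **kwargs):
    """Return True if ssh connects and the remote command exits with status 0."""
    try:
        result = subprocess.run(
            build_ssh_command(host, **kwargs),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh binary missing, or the connection hung past the timeout
        return False
    return result.returncode == 0
```

The appeal of this approach is that a failure can be reproduced outside of Python by running the same argv by hand, for example ssh -o BatchMode=yes -p 2222 example.com true.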
@shcheklein I was not able to reproduce it yet. This is not the first issue where the ssh command is more accepting of file formats than paramiko (e.g. https://github.com/paramiko/paramiko/issues/1226, https://github.com/paramiko/paramiko/issues/1015), and I'm wondering whether we could eliminate such problems altogether, and get a better way of reproducing even potential issues, by switching to much more widely used and reliable CLI commands. It doesn't get less predictable than python modules :) especially when you consider that we would use pretty basic functions of ssh or rsync, which have been stable for years. I haven't decided yet which one I prefer, but these are the options that are worth considering.
Relying on the system environment / userland is delegating the responsibility to the user; that way, DVC can handle a more "Unix" approach to remotes, something like:
dvc config.remote.command "rsync --archive --verbose --compress --progress --rsh 'ssh -i ~/ssh/aws.pem' .dvc/cache myhost:/tmp/dvc/cache"
or:
dvc config.remote.command "aws s3 cp .dvc/cache s3://dvc-bucket/cache"
allowing them to use dvc push as an alias to upload dvc's cache.
I guess that decision depends a lot on the interface you want to maintain and how friendly you want it.
In my opinion, supporting native uploading to s3, azure, gcp, etc. is a great help for not-so-techy users who just want to collaborate with others, even if that implies taking care of some dependency issues.
@mroutis I was actually thinking about using ssh or rsync commands internally, same way we do that in parts of dvc/remote/ssh.py and in the whole dvc/remote/hdfs.py. That being said your idea with configuring "remote command" is very interesting, since it might allow user to configure any way of transferring files that they would like without even getting into writing python modules. Yet, it would require a lot of commands to be configured(download file, upload file, check if file exists + commands for our remote dependency/output features) which makes it a lot less elegant and not suitable in the current architecture since at that point writing a python module no longer seems that hard :)
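To make the "a lot of commands" point concrete, here is a rough sketch of what a command-configured remote might have to carry. Everything in it is hypothetical: CommandRemote, its three fields and the {local}/{remote} placeholders are invented for this example and do not exist in dvc.

```python
import shlex
from dataclasses import dataclass

OPERATIONS = ("upload", "download", "exists")


@dataclass
class CommandRemote:
    """Hypothetical remote defined purely by shell command templates."""

    upload: str
    download: str
    exists: str

    def render(self, operation, **paths):
        """Fill a template with shell-quoted paths and return the command line."""
        if operation not in OPERATIONS:
            raise ValueError("unknown operation: " + operation)
        template = getattr(self, operation)
        quoted = {name: shlex.quote(value) for name, value in paths.items()}
        return template.format(**quoted)


# One template per operation is already needed for the simplest case.
remote = CommandRemote(
    upload="aws s3 cp {local} {remote}",
    download="aws s3 cp {remote} {local}",
    exists="aws s3 ls {remote}",
)
print(remote.render("upload", local=".dvc/cache/ab/cdef",
                    remote="s3://dvc-bucket/cache/ab/cdef"))
```

Even this minimal version needs three templates per remote, before covering the remote dependency/output features mentioned above.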
@efiop, can we close this one?
@mroutis Let's leave it open for now. I'll try to come back to this soon.
Hi @kuzeyron !
Tried to use ed25519 key again and it seems to work for me just fine. Could you please show your paramiko version? I.e. pip freeze | grep paramiko if you've installed dvc with pip.
Thanks,
Ruslan
@kuzeyron I am using 2.4.2 as well. ed25519 is not a commit, it is a type of key that produces a -----BEGIN OPENSSH PRIVATE KEY----- header :slightly_smiling_face: . Are you getting the same "not a valid rsa" error that is shown in the logs above? Could you confirm that the key works when you use the plain ssh command?
@kuzeyron Just to be clear, are you using dvc at all? :slightly_smiling_face: Or is it a question about paramiko in your project? If it is, it would be more appropriate to ask it in paramiko's issues, especially since you are able to reproduce it(unlike me) and it would be beneficial for everyone.
Woops! Google pointed me to this page. Forgot to check the url. I read the posts instead. My bad!
I'll move this to Paramiko's page!
@kuzeyron No worries :slightly_smiling_face: It would be amazing if you could post a link to your issue here, so we could follow it too :slightly_smiling_face:
Sure I will do that :)
@efiop One hack I just tried that will do for now is to use openssl req -new -x509 -days 365 -nodes -keyout cert.pem
And rename first line to BEGIN RSA PRIVATE KEY
For the record: This is a paramiko-specific issue discussed at https://github.com/paramiko/paramiko/issues/1015 Closing.
Had this error message as well.
How was the key generated?
If done using ssh-keygen without specifying the output as a PEM file, Paramiko can raise issues. More info here:
https://github.com/paramiko/paramiko/issues/1313
Hope this helps!
Link is dead. Sad that I'm still getting the same error on Fabric 2
@SharkFourSix Try https://github.com/paramiko/paramiko/issues/1313
I tried -m PEM switch but still not working.
Fabric 2.4.0
Paramiko 2.4.2
Invoke 1.2.0
This sucks
[EDIT Removed irrelevant SSH version info]
@SharkFourSix could you please share the full DVC output when you run it with -v option? Also, how exactly do you run the ssh-keygen? could you share the commands please? All the information to reproduce this exactly would be useful - like your OS, dvc version, etc. Thanks!