Currently, assigning a connection string to an (Azure) remote writes the actual connection string into the .dvc/config file. This means that it will be exposed to Git, which is unfortunate if you care about your data security.
I would suggest a fix where dvc automatically searches for the connection string in an environment variable (or a .env-file, or similar), e.g. DVC_REMOTE_{NAME}. This makes it more secure, and would also allow it to work with modern CI tools.
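A minimal sketch of the lookup proposed above, assuming a per-remote variable named DVC_REMOTE_{NAME}. Both helper names are hypothetical and none of this is DVC's actual API; it only illustrates preferring an environment variable over a value stored in the config file.

```python
import os
import re


def env_var_name(remote_name):
    """Build the hypothetical DVC_REMOTE_{NAME} variable name for a remote."""
    # Environment variable names are conventionally upper-case with
    # underscores, so normalise anything else (e.g. "my-remote").
    return "DVC_REMOTE_" + re.sub(r"[^A-Za-z0-9]", "_", remote_name).upper()


def resolve_connection_string(remote_name, config_value=None, environ=None):
    """Prefer the per-remote environment variable, else the config value."""
    # Accepting an explicit mapping keeps the function easy to test;
    # by default the real process environment is used.
    environ = os.environ if environ is None else environ
    return environ.get(env_var_name(remote_name), config_value)
```

Because the variable name is derived from the remote name, each remote could get its own secret, and a CI system would only need to export the matching variable.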
My 5 cents about implementation:
For user convenience, maybe we should override dvc remote modify myremote connection_string my-connection-string logic so that it creates some automatically ignored config file?
@pared Right, that would also make sense. And then CI systems would simply need to run that command prior to running whatever commands are required. I think that would be a good solution.
@pared great point!
@casparjespersen Or you could simply use AZURE_STORAGE_CONNECTION_STRING env var, that is supported already :slightly_smiling_face:
@efiop
Good point, didn't know about that.
Even so, I would at least warn the user about the possible implications of using dvc remote modify {} connection_string for azure.
@efiop AZURE_STORAGE_CONNECTION_STRING is, as I understand it, supported due to how the Azure Storage SDK works. This also means that it does not support two separate remotes: it will affect all your Azure remotes, and also affect other software that may use Azure Storage SDK. Most use-cases would only use a single remote, I guess, but if DVC is supposed to work with multiple remotes, then this is not a viable solution :-)
@casparjespersen Great point! So we have a --local config, .dvc/config.local, that is added to the gitignore, so Git doesn't track it and the connection string won't be exposed. This is pretty much what @pared suggested, but it is already implemented :) E.g.
dvc remote modify myremote connection_string my-connection-string --local
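The layering described above can be pictured with a small sketch: values from the Git-ignored .dvc/config.local are laid over the tracked .dvc/config. This is a simplified model using plain dictionaries, not DVC's real config loader; `merge_config_layers` is a hypothetical name and the url value is a placeholder.

```python
def merge_config_layers(repo_config, local_config):
    """Return the effective config: local values override repo values."""
    # Copy each section so the tracked (repo) config is never mutated.
    merged = {section: dict(options) for section, options in repo_config.items()}
    for section, options in local_config.items():
        merged.setdefault(section, {}).update(options)
    return merged


# Tracked config: safe to commit, holds no secrets (placeholder url).
repo_config = {"myremote": {"url": "azure://container/path"}}
# Local config: ignored by Git, holds the secret.
local_config = {"myremote": {"connection_string": "my-connection-string"}}

effective = merge_config_layers(repo_config, local_config)
```

The secret only ever lives in the local layer, so committing the tracked layer does not expose it.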
@efiop If that is already supported, then that is awesome. I guess the only steps to take then are to
--local (as @pared also suggested), and --local :-)
@casparjespersen Moved this issue to dvc.org, we'll adjust our docs. Thanks for the feedback! :slightly_smiling_face:
Sure. It looks like a really interesting project, looking forward to trying it out!
@efiop ideally we should be detecting when someone is trying to use dvc remote modify to save something into the project's config. So, it looks like it's not only about docs.
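A rough sketch of what such detection could look like, assuming a hard-coded set of sensitive option names and a config level string. `SENSITIVE_OPTIONS` and `check_secret_exposure` are hypothetical names, and this is not how DVC implements anything.

```python
# Hypothetical list of options whose values should not be committed.
SENSITIVE_OPTIONS = {"connection_string"}


def check_secret_exposure(option, level):
    """Return a warning if a sensitive option targets a Git-tracked config.

    `level` is "local" for .dvc/config.local; any other value is treated
    as a config file that may be tracked by Git.
    """
    if option in SENSITIVE_OPTIONS and level != "local":
        return (
            "'{}' would be written to .dvc/config, which is tracked by Git; "
            "consider using --local.".format(option)
        )
    return None
```

A caller could print the returned message before writing the value, leaving the decision to the user.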
@casparjespersen since you are familiar with Azure, it would be great if you could edit the remote add and remote modify command references. You can just click the Edit on GitHub button and put the note online :)
@shcheklein We could do that, but I don't think it is worth the effort to set up notifications for particular remote parameters. It would only complicate the internal logic for matters that are common sense.