Dvc: `dvc import` from `s3://...` fails when s3 provided by 3rd party provider.

Created on 30 Oct 2018 · 9 Comments · Source: iterative/dvc

  • The standard aws client works if an endpoint URL is provided.

    • aws --endpoint-url https://\
  • The DVC repository works when an endpoint URL is provided in the remote config:
['remote "minio"']
url = s3://<my-path>
endpointurl = <my-endpoint>

However, `dvc import` fails because there is no way to specify the endpoint URL.
More details: https://gist.github.com/bwalsh/1afea0a2499b5e81c507dbc24521038e

All 9 comments

Hi @bwalsh !

You can actually use that remote in the `dvc import` command, e.g. `dvc import remote://minio/path/to/file file`. (NOTE: `path/to/file` should be specified relative to the remote URL, e.g. if the URL is `s3://mybucket/mydir` and you want the file `s3://mybucket/mydir/myfile`, then it should be `remote://minio/myfile`.) Could you please try it out and see if it works for you?
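To make the relative-path rule in the note concrete, here is a minimal Python sketch of the mapping. It is not DVC's actual implementation: `resolve_remote_url` and the `remotes` dictionary are hypothetical names used only for illustration, and the bucket path is the one from the note.

```python
from urllib.parse import urlsplit


def resolve_remote_url(url, remotes):
    """Expand remote://<name>/<relative path> against the named remote's base URL.

    `remotes` maps a remote name to its configured URL (hypothetical structure,
    standing in for the `url` option of each remote in the DVC config).
    """
    parts = urlsplit(url)
    if parts.scheme != "remote":
        return url  # already a concrete URL, nothing to expand
    base = remotes[parts.netloc]  # raises KeyError if the remote is not configured
    relative = parts.path.lstrip("/")
    if not relative:
        return base
    return base.rstrip("/") + "/" + relative


remotes = {"minio": "s3://mybucket/mydir"}
print(resolve_remote_url("remote://minio/myfile", remotes))
# prints: s3://mybucket/mydir/myfile
```

The point of the sketch is that the part after the remote name is appended to the remote's URL, so the bucket and directory must not be repeated in the `remote://` path.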

Thanks,
Ruslan

@bwalsh if you want to use one bucket for the remote cache and another bucket for remote dependencies, I believe you will need to create a separate remote using `dvc remote add` - https://dvc.org/doc/commands-reference/remote#add and use it instead of the minio one
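As a rough picture of what two remotes would look like side by side, the Python sketch below renders a config in the same layout as the snippet in the issue description. It only illustrates the resulting sections; in practice the remotes would be created with `dvc remote add`. The remote name `deps`, the bucket URLs, the endpoint, and the `render_config` helper are all hypothetical.

```python
import configparser
import io


def render_config(remotes):
    """Render remote sections in the layout shown in the issue description.

    `remotes` maps a remote name to its options (hypothetical helper,
    not part of DVC).
    """
    parser = configparser.ConfigParser(interpolation=None)
    for name, options in remotes.items():
        # Section names follow the ['remote "<name>"'] form from the issue.
        parser["'remote \"{}\"'".format(name)] = options
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


config_text = render_config({
    # Remote used for the cache (placeholder values).
    "minio": {
        "url": "s3://cache-bucket/cache",
        "endpointurl": "https://minio.example.com",
    },
    # Separate remote used only as the source of imported dependencies.
    "deps": {
        "url": "s3://data-bucket/raw",
        "endpointurl": "https://minio.example.com",
    },
})
print(config_text)
```

With a layout like this, an import would reference the second remote by name, for example `remote://deps/<path relative to its url>`.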

@efiop @mroutis Just something to think about. I don't have any good suggestions yet. It's a little bit confusing that we use the same notion of remotes both for storing the cache and for specifying external dependencies. The term "remote" came from git, where it has only one specific purpose: a central place to store commits (cache).

@efiop @mroutis @shcheklein thanks everyone for the quick response.

I tried setting up the second remote, adding the endpoint url, and importing. I can confirm the import remote://... works, thanks.

Left to my own devices I wouldn't have guessed it was a possibility.
I wish AWS would fix their CLI config; this issue with awscli has been open for a while. In the interim the remote:// solution works, but it is semantically confusing.

@bwalsh Sorry for a dumb question, but I just want to clarify. Are you aware that the dvc import not only downloads the file, but also tracks the remote source of that file and re-downloads if it has changed?

@efiop: yes, I was aware. The semantics of dvc import are clear; it's the semantics of remote:// as used by import that are confusing.

@efiop, I'm going to label this as documentation to improve/clarify the remote:// "protocol"

Let's close this one, since we have the same issue open on dvc.org.
