dvc push for http remote storage

Created on 28 Jan 2020 · 8 Comments · Source: iterative/dvc

I wanted to upload and download data files over HTTP, but I found out that dvc only supports dvc fetch and dvc pull for HTTP remotes. Why is that? How can I make dvc push work for an HTTP remote? I'm new to dvc, please help.

⚠️ HTTP remotes only support download operations

awaiting response feature request good first issue help wanted

All 8 comments

Hi @harold1505 !

Are you using HTTP to access AWS S3 or something else? Or do you want to use a pure HTTP remote?

It is neither AWS S3 nor another popular platform. It is a platform with a REST API in which files can be retrieved and restored via HTTP. So I guess I want a pure HTTP remote.

@harold1505 Got it :slightly_smiling_face: So the reason why it is not supported is simple: we just didn't get any requests for supporting it until now. :slightly_smiling_face: To implement it, we will need to implement the _upload method in dvc/remote/http.py. There might be some edge cases that we need to handle carefully, but overall the implementation should be pretty straightforward. If you feel like it, maybe even consider taking a shot at it and contributing a PR; we will be happy to help with everything we can. :slightly_smiling_face:

It would be really nice if this feature were available.

Seconded! I think there is some nuance in HTTP posts larger than 2 GB, but there is the Content-Range header, which allows you to split up the transfer into chunks, and I think a few other techniques. But it would be great to be able to write a POST endpoint to roll your own DVC remote backend.
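As a rough illustration of the chunking idea mentioned above, here is a minimal sketch that computes the Content-Range values for splitting one upload into fixed-size pieces. The function name and parameters are hypothetical, and whether a given server accepts Content-Range on upload requests is server-specific; this only shows the header arithmetic, not a transfer.

```python
def content_range_headers(total_size, chunk_size):
    """Yield (start, end, header_value) for each chunk of an upload.

    Hypothetical helper, not part of dvc. Byte positions are
    zero-based and inclusive, following the
    "bytes <first>-<last>/<total>" form of the Content-Range header.
    """
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    for start in range(0, total_size, chunk_size):
        # The last chunk may be shorter than chunk_size.
        end = min(start + chunk_size, total_size) - 1
        yield start, end, f"bytes {start}-{end}/{total_size}"


if __name__ == "__main__":
    for start, end, header in content_range_headers(10, 4):
        print(start, end, header)
```

Each yielded header value would accompany the request that carries bytes start through end of the file.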

As discussed with Ivan, I'll be working on implementing this

Just to clarify, the desired behavior for this is to have basic push support via making HTTP POST requests to http://<remote_url>/<filename>, and not push via some other HTTP based protocol like WebDAV (see also #1153)?

@harold1505 can you give some details on what exactly the platform you are using expects in terms of PUT/POST requests for file uploads?

The main reason I'm asking is that as far as I understand it, git's HTTP remote is read-only (like dvc) for "dumb" web servers. HTTP remote write/push is only supported for web servers that can talk git's "smart" protocol or WebDAV.

> Just to clarify, the desired behavior for this is to have basic push support via making HTTP POST requests to http://<remote_url>/<filename>, and not push via some other HTTP based protocol like WebDAV (see also #1153)?

@pmrowla Great question! Correct, the http remote should support POST requests. HTTP-based protocols (like S3, btw) are handled separately in their own remotes, so if we ever need to add WebDAV support, we will likely create a separate remote class for it.

> The main reason I'm asking is that as far as I understand it, git's HTTP remote is read-only (like dvc) for "dumb" web servers. HTTP remote write/push is only supported for web servers that can talk git's "smart" protocol or WebDAV.

I suppose you are asking whether @harold1505 is trying to push to a git HTTP server. I just wanted to clarify that what we need to do here is support generic HTTP servers that allow uploading through standard POST requests, without any special protocols on top.
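For reference, a plain POST upload to http://<remote_url>/<filename> could look roughly like the sketch below. This is not dvc's actual _upload implementation; the function name, the example URL, and the Content-Type value are assumptions for illustration, and it uses only the Python standard library.

```python
import urllib.request
from urllib.parse import quote


def build_upload_request(remote_url, filename, data):
    """Build a POST request that uploads `data` to <remote_url>/<filename>.

    Hypothetical helper, not part of dvc. The Content-Type is an
    assumption; a real server may expect something else.
    """
    # quote() escapes unsafe characters but keeps "/" so nested paths work.
    url = remote_url.rstrip("/") + "/" + quote(filename)
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


if __name__ == "__main__":
    req = build_upload_request("http://example.com/store", "ab/cdef", b"payload")
    print(req.get_method(), req.full_url)
    # Actually sending it would be: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the URL and header logic easy to check without a running server.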
