Vision: download_url function does not accept proxy

Created on 18 Jun 2019  ·  6 Comments  ·  Source: pytorch/vision

https://github.com/pytorch/vision/blob/fefe118ca89480c35b233d931a969fc92f568c4b/torchvision/datasets/utils.py#L57

On corporate networks, GPU servers usually sit behind a firewall and can only reach the outside world through the corporate proxy. It would be great if this function offered an option to download the data via a proxy.

I tried setting the following, but it did not work:
export http_proxy=proxy.server.corp.com:port
export https_proxy=proxy.server.corp.com:port

Thanks

datasets needs discussion

Most helpful comment

Isn't that something that the user can do in their script, before (or even after) importing torchvision?

For example, from https://stackoverflow.com/questions/22967084/urllib-request-urlretrieve-with-proxy we could maybe do something like

from six.moves import urllib

proxy = urllib.request.ProxyHandler({'http': '127.0.0.1'})
# construct a new opener using your proxy settings
opener = urllib.request.build_opener(proxy)
# install the opener at the module level
urllib.request.install_opener(opener)

# now import torchvision
import torchvision

dataset = torchvision.datasets.MNIST('.', download=True)

All 6 comments

I'm also sitting behind a proxy and it's working fine for me. I also have HTTP_PROXY and HTTPS_PROXY set, since some services only check one of the two spellings (upper or lower case). I don't know if that is the case for six.moves.urllib.

If that doesn't help, I agree with @fmassa.
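One way to check which proxy settings urllib actually picks up from the environment is to ask it directly. This is a minimal sketch using the standard library's urllib.request.getproxies(); the proxy address is a hypothetical placeholder, set inside the script only to keep the example self-contained.

```python
import os
import urllib.request

# Hypothetical proxy address (an assumption for illustration); normally these
# variables would already be exported in the shell.
os.environ["http_proxy"] = "http://proxy.server.corp.com:3128"
os.environ["https_proxy"] = "http://proxy.server.corp.com:3128"

# getproxies() returns a dict mapping scheme names to proxy URLs,
# built from the *_proxy environment variables when they are set.
proxies = urllib.request.getproxies()

for scheme in ("http", "https"):
    print(scheme, "->", proxies.get(scheme, "<not set>"))
```

If a scheme shows up as not set, the opener that urllib builds by default will not route requests for that scheme through the proxy.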

I believe this can be addressed by the user, either by following @pmeier's comment or my previous comment.

As such, I'm closing this issue, but let me know if this doesn't work for your cases (I think it should).

I'm using cntlm to route downloads through the proxy in my environment.
I applied the urllib settings above, but torchvision raised the following HTTPError.

/usr/local/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650
    651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 407: Proxy Authentication Required

My script is as follows.

from six.moves import urllib

proxy = urllib.request.ProxyHandler({'http': 'my_cntlm_server:my_cntlm_port'})
opener = urllib.request.build_opener(proxy)
urllib.request.install_opener(opener)

import torchvision
dataset = torchvision.datasets.MNIST('.', download=True)

Could you help me?

Citing from here:

The HTTP 407 Proxy Authentication Required client error status response code indicates that the request has not been applied because it lacks valid authentication credentials for a proxy server that is between the browser and the server that can access the requested resource.

This error is thrown before you even reach the download server. Thus, the proxy server you are using probably requires credentials, i.e. a username and password. I suppose you need to use a ProxyBasicAuthHandler in addition to the plain ProxyHandler.
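The suggestion above could look roughly like the following. It is a sketch rather than a confirmed fix for the cntlm setup: the proxy address, username, and password are hypothetical placeholders, build_authenticated_opener is a helper name made up for this example, and it assumes the proxy uses HTTP Basic authentication.

```python
import urllib.request

# Hypothetical placeholders (assumptions): replace with your own proxy and credentials.
PROXY_HOST = "proxy.server.corp.com:3128"
USERNAME = "user"
PASSWORD = "secret"


def build_authenticated_opener(proxy_host, username, password):
    """Build an opener that uses the proxy and answers 407 Basic auth challenges."""
    proxy_url = "http://" + proxy_host
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None: offer these credentials whatever realm the proxy announces
    password_mgr.add_password(None, proxy_host, username, password)
    auth_handler = urllib.request.ProxyBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(proxy_handler, auth_handler)


opener = build_authenticated_opener(PROXY_HOST, USERNAME, PASSWORD)
# Install the opener globally so later urllib downloads go through it
urllib.request.install_opener(opener)
```

As with the earlier snippet, the opener has to be installed before the download is triggered. If the proxy expects a different scheme (for example NTLM), Basic credentials will likely still be rejected with a 407.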

Thanks @pmeier for the help!
