Vision: ImageNet link dead

Created on 8 Nov 2019 · 4 Comments · Source: pytorch/vision

This issue is similar to #1453 and #151 and was supposed to be closed by #1457, but the download is still broken (as of pytorch==1.3.0, vision==0.4.1).

dataset = ImageNet(root='~/Desktop/data/ImageNet/', split='train', download=True) 

throws an error

Downloading http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_devkit_t12.tar.gz to /home/zafar/Desktop/data/ImageNet/ILSVRC2012_devkit_t12.tar.gz
                  ---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-6-0bf187fa6b69> in <module>
----> 1 imgnet = ImageNet(root='~/Desktop/data/ImageNet/', split='train', download=True)

~/.pyenv/versions/3.6-dev/lib/python3.6/site-packages/torchvision/datasets/imagenet.py in __init__(self, root, split, download, **kwargs)
     53 
     54         if download:
---> 55             self.download()
     56         wnid_to_classes = self._load_meta_file()[0]
     57 

~/.pyenv/versions/3.6-dev/lib/python3.6/site-packages/torchvision/datasets/imagenet.py in download(self)
     73             download_and_extract_archive(archive_dict['url'], self.root,
     74                                          extract_root=tmp_dir,
---> 75                                          md5=archive_dict['md5'])
     76             devkit_folder = _splitexts(os.path.basename(archive_dict['url']))[0]
     77             meta = parse_devkit(os.path.join(tmp_dir, devkit_folder))

~/.pyenv/versions/3.6-dev/lib/python3.6/site-packages/torchvision/datasets/utils.py in download_and_extract_archive(url, download_root, extract_root, filename, md5, remove_finished)
    246         filename = os.path.basename(url)
    247 
--> 248     download_url(url, download_root, filename, md5)
    249 
    250     archive = os.path.join(download_root, filename)

~/.pyenv/versions/3.6-dev/lib/python3.6/site-packages/torchvision/datasets/utils.py in download_url(url, root, filename, md5)
     94                 )
     95             else:
---> 96                 raise e
     97 
     98 

~/.pyenv/versions/3.6-dev/lib/python3.6/site-packages/torchvision/datasets/utils.py in download_url(url, root, filename, md5)
     82             urllib.request.urlretrieve(
     83                 url, fpath,
---> 84                 reporthook=gen_bar_updater()
     85             )
     86         except (urllib.error.URLError, IOError) as e:

~/.pyenv/versions/3.6-dev/lib/python3.6/urllib/request.py in urlretrieve(url, filename, reporthook, data)
    246     url_type, path = splittype(url)
    247 
--> 248     with contextlib.closing(urlopen(url, data)) as fp:
    249         headers = fp.info()
    250 

~/.pyenv/versions/3.6-dev/lib/python3.6/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221     else:
    222         opener = _opener
--> 223     return opener.open(url, data, timeout)
    224 
    225 def install_opener(opener):

~/.pyenv/versions/3.6-dev/lib/python3.6/urllib/request.py in open(self, fullurl, data, timeout)
    530         for processor in self.process_response.get(protocol, []):
    531             meth = getattr(processor, meth_name)
--> 532             response = meth(req, response)
    533 
    534         return response

~/.pyenv/versions/3.6-dev/lib/python3.6/urllib/request.py in http_response(self, request, response)
    640         if not (200 <= code < 300):
    641             response = self.parent.error(
--> 642                 'http', request, response, code, msg, hdrs)
    643 
    644         return response

~/.pyenv/versions/3.6-dev/lib/python3.6/urllib/request.py in error(self, proto, *args)
    568         if http_err:
    569             args = (dict, 'default', 'http_error_default') + orig_args
--> 570             return self._call_chain(*args)
    571 
    572 # XXX probably also want an abstract factory that knows when it makes

~/.pyenv/versions/3.6-dev/lib/python3.6/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    502         for handler in handlers:
    503             func = getattr(handler, meth_name)
--> 504             result = func(*args)
    505             if result is not None:
    506                 return result

~/.pyenv/versions/3.6-dev/lib/python3.6/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    648 class HTTPDefaultErrorHandler(BaseHandler):
    649     def http_error_default(self, req, fp, code, msg, hdrs):
--> 650         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    651 
    652 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: Not Found

The download link itself is dead (screenshot omitted).
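
One way to confirm the dead link without going through torchvision is to ask the server for its status code directly. The snippet below is a minimal sketch using only the standard library; `http_status` is a hypothetical helper name, not part of torchvision, and it assumes the server answers `HEAD` requests.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code for `url`, or None if no HTTP response arrived."""
    # HEAD asks for headers only, so a large archive is not actually downloaded.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (such as the 404 in the traceback) are raised as HTTPError.
        return err.code
    except OSError:
        # URLError (DNS failure, refused connection) and socket timeouts end up here.
        return None
```

Calling `http_status("http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_devkit_t12.tar.gz")` should return the same code that `urlretrieve` failed on, which helps separate a server-side problem from a local torchvision or network issue.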

Environment

$> python ~/Git/pytorch/torch/utils/collect_env.py 
Collecting environment information...
PyTorch version: 1.4.0.dev20191110
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.10.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.0.176
GPU models and configuration: 
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce GT 730

Nvidia driver version: 430.50
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.0

Versions of relevant libraries:
[pip3] numpy==1.17.3
[pip3] torch==1.3.1
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.4.0
[pip3] torchvision==0.4.1
[pip3] torchviz==0.0.1
[conda] Could not collect

All 4 comments

This was indeed fixed by #1457 on the master branch:

https://github.com/pytorch/vision/blob/95131de394543a7c34bd51932bdfce21dae516c1/torchvision/datasets/imagenet.py#L40-L49

Unfortunately, this commit did not make it into the recent releases and this is why you still get the error. You have to ask @fmassa why this is the case.

Hi,

We have only cut a minor release, 0.4.2, which contained only what was necessary for improved video reading.

The next release of torchvision will include the fixes from @pmeier.

Why is ImageNet so much harder to download now?

As far as I know, there were never _official_ public download links. ImageNet required users to register and provided download links afterwards. Still, the archives were publicly accessible, and that access has now been closed. You have to ask the ImageNet staff why they did this.
