spaCy: ReadTimeoutError during model download

Created on 29 Jun 2017  ·  5 comments  ·  Source: explosion/spaCy

When trying to download the French model, I got a ReadTimeoutError:

python -m spacy download fr

Traceback:

Collecting https://github.com/explosion/spacy-models/releases/download/fr_depvec_web_lg-1.0.0/fr_depvec_web_lg-1.0.0.tar.gz
  Downloading https://github.com/explosion/spacy-models/releases/download/fr_depvec_web_lg-1.0.0/fr_depvec_web_lg-1.0.0.tar.gz (1424.8MB)
Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/local/lib/python2.7/dist-packages/pip/commands/install.py", line 324, in run
    requirement_set.prepare_files(finder)
  File "/usr/local/lib/python2.7/dist-packages/pip/req/req_set.py", line 380, in prepare_files
    ignore_dependencies=self.ignore_dependencies))
  File "/usr/local/lib/python2.7/dist-packages/pip/req/req_set.py", line 620, in _prepare_file
    session=self.session, hashes=hashes)
  File "/usr/local/lib/python2.7/dist-packages/pip/download.py", line 821, in unpack_url
    hashes=hashes
  File "/usr/local/lib/python2.7/dist-packages/pip/download.py", line 659, in unpack_http_url
    hashes)
  File "/usr/local/lib/python2.7/dist-packages/pip/download.py", line 882, in _download_http_url
    _download_url(resp, link, content_file, hashes)
  File "/usr/local/lib/python2.7/dist-packages/pip/download.py", line 605, in _download_url
    consume(downloaded_chunks)
  File "/usr/local/lib/python2.7/dist-packages/pip/utils/__init__.py", line 852, in consume
    deque(iterator, maxlen=0)
  File "/usr/local/lib/python2.7/dist-packages/pip/download.py", line 571, in written_chunks
    for chunk in chunks:
  File "/usr/local/lib/python2.7/dist-packages/pip/utils/ui.py", line 139, in iter
    for x in it:
  File "/usr/local/lib/python2.7/dist-packages/pip/download.py", line 560, in resp_read
    decode_content=False):
  File "/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/response.py", line 357, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/response.py", line 324, in read
    flush_decoder = True
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/response.py", line 246, in _error_catcher
    raise ReadTimeoutError(self._pool, None, 'Read timed out.')
ReadTimeoutError: HTTPSConnectionPool(host='github-production-release-asset-2e65be.s3.amazonaws.com', port=443): Read timed out.

    Downloading fr_depvec_web_lg-1.0.0/fr_depvec_web_lg-1.0.0.tar.gz
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/spacy/__main__.py", line 133, in <module>
    plac.Interpreter.call(CLI)
  File "/usr/local/lib/python2.7/dist-packages/plac_ext.py", line 1142, in call
    print(out)
  File "/usr/local/lib/python2.7/dist-packages/plac_ext.py", line 914, in __exit__
    self.close(exctype, exc, tb)
  File "/usr/local/lib/python2.7/dist-packages/plac_ext.py", line 952, in close
    self._interpreter.throw(exctype, exc, tb)
  File "/usr/local/lib/python2.7/dist-packages/plac_ext.py", line 964, in _make_interpreter
    arglist = yield task
  File "/usr/local/lib/python2.7/dist-packages/plac_ext.py", line 1139, in call
    raise_(task.etype, task.exc, task.tb)
  File "/usr/local/lib/python2.7/dist-packages/plac_ext.py", line 380, in _wrap
    for value in genobj:
  File "/usr/local/lib/python2.7/dist-packages/plac_ext.py", line 95, in gen_exc
    raise_(etype, exc, tb)
  File "/usr/local/lib/python2.7/dist-packages/plac_ext.py", line 966, in _make_interpreter
    cmd, result = self.parser.consume(arglist)
  File "/usr/local/lib/python2.7/dist-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/spacy/__main__.py", line 33, in download
    cli_download(model, direct)
  File "/usr/local/lib/python2.7/dist-packages/spacy/cli/download.py", line 24, in download
    link_package(model_name, model, force=True)
  File "/usr/local/lib/python2.7/dist-packages/spacy/cli/link.py", line 22, in link_package
    pkg = importlib.import_module(package_name)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named fr_depvec_web_lg

The command '/bin/sh -c python -m spacy download fr' returned a non-zero code: 1
Labels: enhancement, install

All 5 comments

Maybe adding something like --timeout=10000 to the pip install command in the download CLI would help?
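Until the download CLI accepts extra flags, one way to try the longer timeout is to call pip directly on the release archive. The sketch below only builds the command; the helper names `model_url` and `pip_install_command` and the 600-second default are assumptions made for illustration, not part of spaCy. The URL pattern is copied from the traceback above, and `--timeout` is pip's own socket timeout option, given in seconds.

```python
import sys

# Base URL of the model releases, as seen in the traceback above.
RELEASES = "https://github.com/explosion/spacy-models/releases/download"


def model_url(name, version):
    """Build the archive URL for a model (hypothetical helper)."""
    return "{0}/{1}-{2}/{1}-{2}.tar.gz".format(RELEASES, name, version)


def pip_install_command(name, version, timeout=600):
    """Return a pip command that installs the model archive with a
    longer socket timeout (hypothetical helper; timeout in seconds)."""
    return [
        sys.executable, "-m", "pip", "install",
        "--timeout", str(timeout),
        model_url(name, version),
    ]
```

To actually run it, pass the list to `subprocess.check_call(pip_install_command("fr_depvec_web_lg", "1.0.0"))`. Note that this bypasses `spacy download`, so the linking step that the traceback shows running afterwards (`link_package`) is not performed.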

Thanks for the suggestion! We were actually thinking about just adding an option, for example via environment variables, to add custom flags and modifications to the pip install command. (This would also cover other use cases, e.g. when users have to specify a custom location.)

In the meantime, you can always download the model manually (via your browser or however else you like) from https://github.com/explosion/spacy-models/releases/tag/fr_depvec_web_lg-1.0.0 and then install it via pip with a local path:

pip install fr_depvec_web_lg-1.0.0.tar.gz
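For a 1.4GB archive on a flaky connection, the manual download can also be scripted so that a dropped transfer picks up where it stopped instead of starting over. This is a minimal sketch using only the standard library; `download_with_resume` is a hypothetical helper, and it assumes the server honours HTTP `Range` requests (if it does not, the function simply downloads the whole file again).

```python
import os
import urllib.request


def download_with_resume(url, dest, timeout=60, chunk_size=1 << 20):
    """Download url to dest, resuming from a partial file if one exists."""
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url)
    if offset:
        # Ask the server for only the bytes we do not have yet.
        request.add_header("Range", "bytes=%d-" % offset)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # 206 means the Range request was honoured; anything else means
        # the full file is being sent, so overwrite the partial copy.
        resumed = offset > 0 and getattr(response, "status", None) == 206
        with open(dest, "ab" if resumed else "wb") as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
    return dest
```

Calling it again after a timeout with the same `dest` continues the transfer; once it returns, the `pip install` command above works on the local file. If the file on disk is already complete, a server may reject the range request with an error, which this sketch does not handle.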

In spaCy v2, this will also be a little less problematic, because the models will be much smaller (~15MB for the default English model).

In case you haven't seen it yet, new French models are now available for the new version, spaCy v2.0: https://spacy.io/models/fr (including a 37MB fr_core_news_sm model).

For details on possible improvements to the download command, see #1456. Suggestions and contributions welcome!

I've tried to run python3 -m spacy download en_core_web_md maybe 40 times now and it keeps failing due to the timeout. What is needed to pass in a longer timeout option?

