As described in the documentation, I run the following command to install spaCy:
conda install -c conda-forge spacy
However, if I call:
python -m spacy download en
It fails with the following error message:
/opt/conda/bin/python: No module named spacy.__main__; 'spacy' is a package and cannot be directly executed
Thanks for the report. I'm pretty sure this is related to #2462 – for some reason in some situations, users who install via conda end up with an old version of spaCy. Like, a really really old one, the one that the official Anaconda distribution used to ship and that didn't have CLI commands yet.
If you still have your install logs, could you check if there's some hint at that? Maybe conda didn't actually install the latest version from conda-forge, because your local Anaconda distribution already had spaCy included (but an old version)? And does conda update -c conda-forge spacy make a difference?
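A quick way to check for the symptom directly: `python -m spacy` only works if the installed package ships a `spacy.__main__` module, which is what the error message above says is missing. Here is a minimal sketch using only the standard library. The helper name `has_cli_entry_point` is made up for illustration and is not part of spaCy or conda.

```python
import importlib.util


def has_cli_entry_point(package_name):
    """Return True if `package_name` ships a `__main__` submodule.

    `python -m <package>` can only run a package that has one.
    """
    try:
        spec = importlib.util.find_spec(package_name + ".__main__")
    except ModuleNotFoundError:
        # The package is not installed at all, or it is a plain module
        # rather than a package.
        return False
    return spec is not None


if __name__ == "__main__":
    # In an affected environment this would be expected to print False.
    print(has_cli_entry_point("spacy"))
```

If this prints `False` while `conda list` shows spaCy as installed, that would be consistent with the very old version having been picked up.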
@DSamuylov, not sure if you solved this, but the solution @BBCMax alluded to in his second comment of the thread @ines refers to worked for me. I'm also on Ubuntu 18.04 and py3.6 via anaconda. Now I'm running into this problem instead, though that seems solvable. :-)
@drussellmrichie Do you still have your install logs and could you check if there's anything in them that looks like an old version of spaCy was used? Like, which command did you run and what exactly was conda doing there that resulted in you ending up with an old version?
I'd love to get to the bottom of this and see if there's anything we can do to prevent this. This is such a weird problem and unfortunately, it's mostly been hitting new users which is extra frustrating.
Btw, about the download timeout problem: If you want full control over how it's downloaded, I'd suggest finding the direct link to the archive file via the model releases. The model details also list the file size, so you can decide how you want to download it and what works best for you.
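For the "full control" route, a rough sketch of a chunked download with an explicit timeout could look like the following. The function name, the default timeout and the chunk size are all assumptions for illustration. The URL has to be copied by hand from the model releases page, and nothing here is spaCy's own download logic.

```python
import urllib.request


def download_archive(url, dest_path, timeout=60, chunk_size=1024 * 1024):
    """Stream `url` to `dest_path` chunk by chunk; return the bytes written."""
    written = 0
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(dest_path, "wb") as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break  # end of stream
                out.write(chunk)
                written += len(chunk)
    return written
```

The returned byte count can be compared against the file size listed in the model details. As far as I know, the spaCy docs describe installing a downloaded model archive by pointing `pip install` at the local file.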
@ines, happy to help if I can. Not sure if I have install logs (how would I find those?), but I just did history | grep conda at the command line, and it seems I did the following:
20 bash Anaconda3-5.2.0-Linux-x86_64.sh
27 conda list
38 anaconda
43 export PATH=~/anaconda3/bin:$PATH
44 conda
45 conda list
48 conda --version
55 conda
62 bash Anaconda3-5.2.0-Linux-x86_64.sh
86 conda list
87 conda install spacy
I think that last line must have been how I got the old version of spacy on my machine (which I only got a few weeks ago).
@drussellmrichie Thanks so much! (And not sure if conda saves the logs – I was mostly thinking about that detailed table it prints that tells you what it's going to install and update, but that might only be available in the scrollback.)
So in this case, we can rule out an old Anaconda distribution then. My other theory was that maybe you have some other package installed (maybe a dependency of something else) that conflicts with one of spaCy's dependencies. So instead of updating all of the other dependencies, conda then just serves you an old version of spaCy that's compatible with the constraints.
Do you remember which packages you installed explicitly since getting the new machine?
After some more grepping through my terminal history, it seems that, after installing spacy, I installed tqdm and gensim through conda, and textblob-de and msgpack through pip.
@drussellmrichie Thanks! (And sorry for asking you all these random questions, but I really want to get to the bottom of this issue and I feel like we're very close here.)
Do you remember if you installed anything before you installed spaCy?
No worries about the questions. This is the absolute least I can do after using spaCy for free. :-)
Here's what I installed through the command-line (history | grep install) before spaCy:
3 sudo apt install nvidia-381
6 sudo ubuntu-drivers autoinstall
12 sudo ubuntu-drivers autoinstall
15 sudo ubuntu-drivers autoinstall
29 sup apt install vim
30 sudo apt install vim
36 sudo apt install google-chrome-stable
41 sudo apt install gnome-usage
59 sudo apt-get install fluxgui
I also installed pycharm, atom, dropbox, and skype, I think through some installers (not through commandline), but not sure if I did those before or after installing spacy.
@drussellmrichie Thanks! There's nothing here that looks problematic. Basically, I'm trying to figure out if there could have been something in your Anaconda environment before you installed spaCy that caused conda to download and install an old version of spaCy (e.g. to ensure compatibility).
A similar problem also just came up in #2542. The user didn't have access to the install log anymore, but shared the update log. Copying it over here for further investigation:
Solving environment: done
## Package Plan ##
environment location: /root/anaconda3
added / updated specs:
- spacy==2.0.11
The following packages will be downloaded:
package | build
---------------------------|-----------------
cytoolz-0.8.2 | py36_0 1.1 MB conda-forge
preshed-1.0.0 | py36hfc679d8_0 226 KB conda-forge
termcolor-1.1.0 | py_2 6 KB conda-forge
thinc-6.10.1 | py36hd61447b_0 1.5 MB
pathlib-1.0.1 | py_1 15 KB conda-forge
libgcc-7.2.0 | h69d50b8_2 304 KB
regex-2017.11.09 | py36_0 670 KB conda-forge
ca-certificates-2018.4.16 | 0 139 KB conda-forge
openssl-1.0.2o | 0 3.5 MB conda-forge
murmurhash-0.28.0 | py36hfc679d8_0 39 KB conda-forge
tqdm-4.23.4 | py_0 35 KB conda-forge
msgpack-numpy-0.4.3 | py_0 10 KB conda-forge
spacy-2.0.11 | py36_0 41.9 MB conda-forge
dill-0.2.8.2 | py36_0 109 KB conda-forge
------------------------------------------------------------
Total: 49.6 MB
The following NEW packages will be INSTALLED:
dill: 0.2.8.2-py36_0 conda-forge
libgcc: 7.2.0-h69d50b8_2
msgpack-numpy: 0.4.3-py_0 conda-forge
pathlib: 1.0.1-py_1 conda-forge
regex: 2017.11.09-py36_0 conda-forge
termcolor: 1.1.0-py_2 conda-forge
tqdm: 4.23.4-py_0 conda-forge
The following packages will be REMOVED:
anaconda: 5.2.0-py36_3
The following packages will be UPDATED:
ca-certificates: 2018.03.07-0 --> 2018.4.16-0 conda-forge
murmurhash: 0.26.4-py36_0 conda-forge --> 0.28.0-py36hfc679d8_0 conda-forge
openssl: 1.0.2o-h20670df_0 --> 1.0.2o-0 conda-forge
preshed: 0.46.4-py36_0 conda-forge --> 1.0.0-py36hfc679d8_0 conda-forge
spacy: 0.101.0-py36_0 --> 2.0.11-py36_0 conda-forge
thinc: 5.0.8-py36_0 --> 6.10.1-py36hd61447b_0
The following packages will be DOWNGRADED:
cytoolz: 0.9.0.1-py36h14c3975_0 --> 0.8.2-py36_0 conda-forge
Proceed ([y]/n)?
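For anyone who still has a plan like this in their scrollback, the `old --> new` entries can be pulled out mechanically to see which spaCy version was there before. This is a minimal sketch: the regular expression is inferred only from the log lines shown in this thread, so it is an assumption about the format and may not cover other conda versions.

```python
import re

# Matches entries such as "spacy: 0.101.0-py36_0 --> 2.0.11-py36_0 conda-forge".
# A channel name may or may not follow each version-build string.
CHANGE_LINE = re.compile(
    r"^\s*(?P<name>[\w.\-]+):\s+"
    r"(?P<old>\S+)(?:\s+(?P<old_channel>\S+))?\s+-->\s+"
    r"(?P<new>\S+)(?:\s+(?P<new_channel>\S+))?\s*$"
)


def parse_plan_changes(plan_text):
    """Return {package: (old_version, new_version)} for each 'old --> new' line."""
    changes = {}
    for line in plan_text.splitlines():
        match = CHANGE_LINE.match(line)
        if match:
            # Entries look like "<version>-<build>"; keep only the version part.
            old = match.group("old").partition("-")[0]
            new = match.group("new").partition("-")[0]
            changes[match.group("name")] = (old, new)
    return changes
```

Running `parse_plan_changes` on the log above and looking up the `"spacy"` key gives the version that was installed before the update.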
Unfortunately, I don't think I have a way to look at the output from when I ran the update days ago. =( But my output must have looked similar to that.
Let me know if there's anything else I can do.
Quick update on my experiments: I used a clean VM, installed the latest version of the Anaconda distribution and tried to install spaCy using various commands. It always gave me v0.101.0, even though the latest versions are all available 🧐
Anaconda3-5.2.0-Linux-x86_64.sh
conda install spacy
conda install spacy -c conda-forge
conda install spacy -c anaconda
## Package Plan ##
environment location: /home/xxx/anaconda3
added / updated specs:
- spacy
The following packages will be downloaded:
package | build
---------------------------|-----------------
ujson-1.35 | py36h14c3975_0 26 KB
murmurhash-0.26.4 | py36_0 32 KB
preshed-0.46.4 | py36_0 211 KB
conda-4.5.8 | py36_0 1.0 MB
semver-2.8.0 | py36_0 11 KB
sputnik-0.9.3 | py36_0 45 KB
plac-0.9.6 | py36_0 36 KB
cymem-1.31.2 | py36h6bb024c_0 26 KB
spacy-0.101.0 | py36_0 5.7 MB
thinc-5.0.8 | py36_0 1.3 MB
------------------------------------------------------------
Total: 8.4 MB
The following NEW packages will be INSTALLED:
cymem: 1.31.2-py36h6bb024c_0
murmurhash: 0.26.4-py36_0
plac: 0.9.6-py36_0
preshed: 0.46.4-py36_0
semver: 2.8.0-py36_0
spacy: 0.101.0-py36_0
sputnik: 0.9.3-py36_0
thinc: 5.0.8-py36_0
ujson: 1.35-py36h14c3975_0
The following packages will be UPDATED:
conda: 4.5.4-py36_0 --> 4.5.8-py36_0
conda list after clean installation
This includes everything that was installed via the distribution. It doesn't include spaCy.
# packages in environment at /home/prodigy/anaconda3:
#
# Name Version Build Channel
_ipyw_jlab_nb_ext_conf 0.1.0 py36he11e457_0
alabaster 0.7.10 py36h306e16b_0
anaconda 5.2.0 py36_3
anaconda-client 1.6.14 py36_0
anaconda-navigator 1.8.7 py36_0
anaconda-project 0.8.2 py36h44fb852_0
asn1crypto 0.24.0 py36_0
astroid 1.6.3 py36_0
astropy 3.0.2 py36h3010b51_1
attrs 18.1.0 py36_0
babel 2.5.3 py36_0
backcall 0.1.0 py36_0
backports 1.0 py36hfa02d7e_1
backports.shutil_get_terminal_size 1.0.0 py36hfea85ff_2
beautifulsoup4 4.6.0 py36h49b8c8c_1
bitarray 0.8.1 py36h14c3975_1
bkcharts 0.2 py36h735825a_0
blas 1.0 mkl
blaze 0.11.3 py36h4e06776_0
bleach 2.1.3 py36_0
blosc 1.14.3 hdbcaa40_0
bokeh 0.12.16 py36_0
boto 2.48.0 py36h6e4cd66_1
bottleneck 1.2.1 py36haac1ea0_0
bzip2 1.0.6 h14c3975_5
ca-certificates 2018.03.07 0
cairo 1.14.12 h7636065_2
certifi 2018.4.16 py36_0
cffi 1.11.5 py36h9745a5d_0
chardet 3.0.4 py36h0f667ec_1
click 6.7 py36h5253387_0
cloudpickle 0.5.3 py36_0
clyent 1.2.2 py36h7e57e65_1
colorama 0.3.9 py36h489cec4_0
conda 4.5.4 py36_0
conda-build 3.10.5 py36_0
conda-env 2.6.0 h36134e3_1
conda-verify 2.0.0 py36h98955d8_0
contextlib2 0.5.5 py36h6c84a62_0
cryptography 2.2.2 py36h14c3975_0
curl 7.60.0 h84994c4_0
cycler 0.10.0 py36h93f1223_0
cython 0.28.2 py36h14c3975_0
cytoolz 0.9.0.1 py36h14c3975_0
dask 0.17.5 py36_0
dask-core 0.17.5 py36_0
datashape 0.5.4 py36h3ad6b5c_0
dbus 1.13.2 h714fa37_1
decorator 4.3.0 py36_0
distributed 1.21.8 py36_0
docutils 0.14 py36hb0f60f5_0
entrypoints 0.2.3 py36h1aec115_2
et_xmlfile 1.0.1 py36hd6bccc3_0
expat 2.2.5 he0dffb1_0
fastcache 1.0.2 py36h14c3975_2
filelock 3.0.4 py36_0
flask 1.0.2 py36_1
flask-cors 3.0.4 py36_0
fontconfig 2.12.6 h49f89f6_0
freetype 2.8 hab7d2ae_1
get_terminal_size 1.0.0 haa9412d_0
gevent 1.3.0 py36h14c3975_0
glib 2.56.1 h000015b_0
glob2 0.6 py36he249c77_0
gmp 6.1.2 h6c8ec71_1
gmpy2 2.0.8 py36hc8893dd_2
graphite2 1.3.11 h16798f4_2
greenlet 0.4.13 py36h14c3975_0
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 hb453b48_1
h5py 2.7.1 py36ha1f6525_2
harfbuzz 1.7.6 h5f0a787_1
hdf5 1.10.2 hba1933b_1
heapdict 1.0.0 py36_2
html5lib 1.0.1 py36h2f9c1c0_0
icu 58.2 h9c2bf20_1
idna 2.6 py36h82fb2a8_1
imageio 2.3.0 py36_0
imagesize 1.0.0 py36_0
intel-openmp 2018.0.0 8
ipykernel 4.8.2 py36_0
ipython 6.4.0 py36_0
ipython_genutils 0.2.0 py36hb52b0d5_0
ipywidgets 7.2.1 py36_0
isort 4.3.4 py36_0
itsdangerous 0.24 py36h93cc618_1
jbig 2.1 hdba287a_0
jdcal 1.4 py36_0
jedi 0.12.0 py36_1
jinja2 2.10 py36ha16c418_0
jpeg 9b h024ee3a_2
jsonschema 2.6.0 py36h006f8b5_0
jupyter 1.0.0 py36_4
jupyter_client 5.2.3 py36_0
jupyter_console 5.2.0 py36he59e554_1
jupyter_core 4.4.0 py36h7c827e3_0
jupyterlab 0.32.1 py36_0
jupyterlab_launcher 0.10.5 py36_0
kiwisolver 1.0.1 py36h764f252_0
lazy-object-proxy 1.3.1 py36h10fcdad_0
libcurl 7.60.0 h1ad7b7a_0
libedit 3.1.20170329 h6b74fdf_2
libffi 3.2.1 hd88cf55_4
libgcc-ng 7.2.0 hdf63c60_3
libgfortran-ng 7.2.0 hdf63c60_3
libpng 1.6.34 hb9fc6fc_0
libsodium 1.0.16 h1bed415_0
libssh2 1.8.0 h9cfc8f7_4
libstdcxx-ng 7.2.0 hdf63c60_3
libtiff 4.0.9 he85c1e1_1
libtool 2.4.6 h544aabb_3
libxcb 1.13 h1bed415_1
libxml2 2.9.8 h26e45fe_1
libxslt 1.1.32 h1312cb7_0
llvmlite 0.23.1 py36hdbcaa40_0
locket 0.2.0 py36h787c0ad_1
lxml 4.2.1 py36h23eabaa_0
lzo 2.10 h49e0be7_2
markupsafe 1.0 py36hd9260cd_1
matplotlib 2.2.2 py36h0e671d2_1
mccabe 0.6.1 py36h5ad9710_1
mistune 0.8.3 py36h14c3975_1
mkl 2018.0.2 1
mkl-service 1.1.2 py36h17a0993_4
mkl_fft 1.0.1 py36h3010b51_0
mkl_random 1.0.1 py36h629b387_0
more-itertools 4.1.0 py36_0
mpc 1.0.3 hec55b23_5
mpfr 3.1.5 h11a74b3_2
mpmath 1.0.0 py36hfeacd6b_2
msgpack-python 0.5.6 py36h6bb024c_0
multipledispatch 0.5.0 py36_0
navigator-updater 0.2.1 py36_0
nbconvert 5.3.1 py36hb41ffb7_0
nbformat 4.4.0 py36h31c9010_0
ncurses 6.1 hf484d3e_0
networkx 2.1 py36_0
nltk 3.3.0 py36_0
nose 1.3.7 py36hcdf7029_2
notebook 5.5.0 py36_0
numba 0.38.0 py36h637b7d7_0
numexpr 2.6.5 py36h7bf3b9c_0
numpy 1.14.3 py36hcd700cb_1
numpy-base 1.14.3 py36h9be14a7_1
numpydoc 0.8.0 py36_0
odo 0.5.1 py36h90ed295_0
olefile 0.45.1 py36_0
openpyxl 2.5.3 py36_0
openssl 1.0.2o h20670df_0
packaging 17.1 py36_0
pandas 0.23.0 py36h637b7d7_0
pandoc 1.19.2.1 hea2e7c5_1
pandocfilters 1.4.2 py36ha6701b7_1
pango 1.41.0 hd475d92_0
parso 0.2.0 py36_0
partd 0.3.8 py36h36fd896_0
patchelf 0.9 hf79760b_2
path.py 11.0.1 py36_0
pathlib2 2.3.2 py36_0
patsy 0.5.0 py36_0
pcre 8.42 h439df22_0
pep8 1.7.1 py36_0
pexpect 4.5.0 py36_0
pickleshare 0.7.4 py36h63277f8_0
pillow 5.1.0 py36h3deb7b8_0
pip 10.0.1 py36_0
pixman 0.34.0 hceecf20_3
pkginfo 1.4.2 py36_1
pluggy 0.6.0 py36hb689045_0
ply 3.11 py36_0
prompt_toolkit 1.0.15 py36h17d85b1_0
psutil 5.4.5 py36h14c3975_0
ptyprocess 0.5.2 py36h69acd42_0
py 1.5.3 py36_0
pycodestyle 2.4.0 py36_0
pycosat 0.6.3 py36h0a5515d_0
pycparser 2.18 py36hf9f622e_1
pycrypto 2.6.1 py36h14c3975_8
pycurl 7.43.0.1 py36hb7f436b_0
pyflakes 1.6.0 py36h7bd6a15_0
pygments 2.2.0 py36h0d3125c_0
pylint 1.8.4 py36_0
pyodbc 4.0.23 py36hf484d3e_0
pyopenssl 18.0.0 py36_0
pyparsing 2.2.0 py36hee85983_1
pyqt 5.9.2 py36h751905a_0
pysocks 1.6.8 py36_0
pytables 3.4.3 py36h02b9ad4_2
pytest 3.5.1 py36_0
pytest-arraydiff 0.2 py36_0
pytest-astropy 0.3.0 py36_0
pytest-doctestplus 0.1.3 py36_0
pytest-openfiles 0.3.0 py36_0
pytest-remotedata 0.2.1 py36_0
python 3.6.5 hc3d631a_2
python-dateutil 2.7.3 py36_0
pytz 2018.4 py36_0
pywavelets 0.5.2 py36he602eb0_0
pyyaml 3.12 py36hafb9ca4_1
pyzmq 17.0.0 py36h14c3975_0
qt 5.9.5 h7e424d6_0
qtawesome 0.4.4 py36h609ed8c_0
qtconsole 4.3.1 py36h8f73b5b_0
qtpy 1.4.1 py36_0
readline 7.0 ha6073c6_4
requests 2.18.4 py36he2e5f8d_1
rope 0.10.7 py36h147e2ec_0
ruamel_yaml 0.15.35 py36h14c3975_1
scikit-image 0.13.1 py36h14c3975_1
scikit-learn 0.19.1 py36h7aa7ec6_0
scipy 1.1.0 py36hfc37229_0
seaborn 0.8.1 py36hfad7ec4_0
send2trash 1.5.0 py36_0
setuptools 39.1.0 py36_0
simplegeneric 0.8.1 py36_2
singledispatch 3.4.0.3 py36h7a266c3_0
sip 4.19.8 py36hf484d3e_0
six 1.11.0 py36h372c433_1
snappy 1.1.7 hbae5bb6_3
snowballstemmer 1.2.1 py36h6febd40_0
sortedcollections 0.6.1 py36_0
sortedcontainers 1.5.10 py36_0
sphinx 1.7.4 py36_0
sphinxcontrib 1.0 py36h6d0f590_1
sphinxcontrib-websupport 1.0.1 py36hb5cb234_1
spyder 3.2.8 py36_0
sqlalchemy 1.2.7 py36h6b74fdf_0
sqlite 3.23.1 he433501_0
statsmodels 0.9.0 py36h3010b51_0
sympy 1.1.1 py36hc6d1c1c_0
tblib 1.3.2 py36h34cf8b6_0
terminado 0.8.1 py36_1
testpath 0.3.1 py36h8cadb63_0
tk 8.6.7 hc745277_3
toolz 0.9.0 py36_0
tornado 5.0.2 py36_0
traitlets 4.3.2 py36h674d592_0
typing 3.6.4 py36_0
unicodecsv 0.14.1 py36ha668878_0
unixodbc 2.3.6 h1bed415_0
urllib3 1.22 py36hbe7ace6_0
wcwidth 0.1.7 py36hdf4376a_0
webencodings 0.5.1 py36h800622e_1
werkzeug 0.14.1 py36_0
wheel 0.31.1 py36_0
widgetsnbextension 3.2.1 py36_0
wrapt 1.10.11 py36h28b7045_0
xlrd 1.1.0 py36h1db9f0c_1
xlsxwriter 1.0.4 py36_0
xlwt 1.3.0 py36h7b00a1f_0
xz 5.2.4 h14c3975_4
yaml 0.1.7 had09818_2
zeromq 4.2.5 h439df22_0
zict 0.1.3 py36h3a3bf81_0
zlib 1.2.11 ha838bed_2
This is pretty worrying... I'll now try to install all individual dependencies to see if anything looks suspicious.
spacy==2.0.11
The following packages will be REMOVED:
anaconda: 5.2.0-py36_3
Conclusion: The most interesting part here is that anaconda removes itself. So something must be making it think that it's incompatible with the latest spaCy. And for some reason, the super old version is then preferred...
Update: After reading through conda's debugging logs, I think we found the culprit – msgpack-python. If the Anaconda distribution was recently updated to include the new msgpack, this would also explain why this problem only started to come up recently. msgpack is incompatible with msgpack-python and the "latest" spaCy version that is compatible is 0.101.0, because it didn't depend on any of those.
The bad news: As a result, installing spaCy from the recent Anaconda distribution will currently download 0.101.0 (which is now about two years old?). This happens pretty much always.
The good news: We can fix this by publishing a patch release to Thinc that requires msgpack.
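To make the mechanism concrete, here is a toy model of "pick the newest version that doesn't clash with what's already installed". It is only an illustration of the reasoning above and not conda's actual solver, which is far more involved and can also remove or replace installed packages. The data structures, the function name and the simplified dependency edges are all made up for the example.

```python
# Toy data loosely based on this thread; NOT real package metadata.
# package -> set of packages it cannot coexist with
CONFLICTS = {"msgpack": {"msgpack-python"}}

# (version, requirements) pairs for a hypothetical "spacy" package
CANDIDATES = [
    ((0, 101, 0), set()),       # old release: no msgpack-related requirement
    ((2, 0, 11), {"msgpack"}),  # newer release: simplified to one requirement
]


def newest_compatible(candidates, installed, conflicts):
    """Return the newest version whose requirements do not clash with
    `installed`, treating installed packages as fixed. None if all clash."""
    for version, requires in sorted(candidates, key=lambda c: c[0], reverse=True):
        clashes = {
            needed
            for needed in requires
            if conflicts.get(needed, set()) & installed
        }
        if not clashes:
            return version
    return None


if __name__ == "__main__":
    # With msgpack-python already present, only the old release fits.
    print(newest_compatible(CANDIDATES, {"msgpack-python"}, CONFLICTS))
    # In an empty environment the newest release is chosen.
    print(newest_compatible(CANDIDATES, set(), CONFLICTS))
```

Under these assumptions, an environment that already contains `msgpack-python` falls back to the old release, which mirrors what the fresh Anaconda install did.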
Thanks, ines, for analyzing this problem! I saw that you worked on the patch, but when I do my "conda install -c conda-forge spacy" conda is still determined to install spacy 0.101.0-py36_0. Is your patch public and available through the conda-forge channel?
We are about to introduce students to text mining and want to bring them in contact with the power of spaCy. :-)
Edit: After a "conda update --all" spacy: 2.0.11-py36hf8a1672_1 is offered. Perfect! Thanks a million!
"The good news: We can fix this by publishing a patch release to Thinc that requires msgpack."
@ines : Is this issue fixed now? I still have the same issue
I installed using conda install spacy and got the old version. conda update --all didn't help, so I ran conda uninstall spacy and then pip install spacy, which solved the issue.
We published new versions of spaCy and an updated recipe for Thinc on conda-forge, so the following should now work as expected:
conda install spacy -c conda-forge
I just tested it on a VM using Ubuntu and a clean install of the latest Anaconda distribution. Sorry this took so long – this was really frustrating to debug 😭
Once the official Anaconda channel updates their builds of spaCy and Thinc, the problem should also be resolved when you only run conda install spacy. We don't have control over that one, but it usually happens pretty quickly.