Gensim: Can't import gensim in Windows

Created on 25 Dec 2018 · 14 Comments · Source: RaRe-Technologies/gensim

Description

I cannot import gensim. I use Anaconda on Windows (in a separate virtualenv). The traceback and the dependency versions are below.

Steps/Code/Corpus to Reproduce


import gensim

Actual Results

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\hatem\Anaconda3\envs\crashsim\lib\site-packages\gensim\__init__.py", line 5, in <module>
    from gensim import parsing, corpora, matutils, interfaces, models, similarities, summarization, utils  # noqa:F401
  File "C:\Users\hatem\Anaconda3\envs\crashsim\lib\site-packages\gensim\models\__init__.py", line 15, in <module>
    from .doc2vec import Doc2Vec  # noqa:F401
  File "C:\Users\hatem\Anaconda3\envs\crashsim\lib\site-packages\gensim\models\doc2vec.py", line 86, in <module>
    from gensim.models.doc2vec_inner import train_document_dbow, train_document_dm, train_document_dm_concat
  File "__init__.pxd", line 872, in init gensim.models.doc2vec_inner
ValueError: numpy.ufunc has the wrong size, try recompiling. Expected 192, got 216

Versions


Windows-10-10.0.15063-SP0
Python 3.7.1
NumPy 1.15.4
SciPy 1.2.0
gensim 3.6.0
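
Because `import gensim` itself fails here, the versions cannot be read from `gensim.__version__`. A minimal sketch, assuming Python 3.8+ (for `importlib.metadata`), that reads installed versions from package metadata without importing any compiled extension. The helper names `installed_version` and `environment_report` are hypothetical and not part of gensim:

```python
import platform
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent.

    Only package metadata is read, so a broken C extension is never imported.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names=("numpy", "scipy", "gensim")):
    """Build the report as a list of lines: platform, Python, then packages."""
    lines = [platform.platform(), "Python " + sys.version.split()[0]]
    for name in dist_names:
        version = installed_version(name) or "not installed"
        lines.append("{} {}".format(name, version))
    return lines


if __name__ == "__main__":
    print("\n".join(environment_report()))
```

The report shows what is installed now. It cannot show which numpy version a compiled extension was built against, which is the detail that turns out to matter in this thread.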

All 14 comments

My conda list output, for a detailed look at the versions:

asn1crypto                0.24.0                    <pip>
astroid                   2.1.0                    py37_0
atomicwrites              1.2.1                     <pip>
attrs                     18.2.0                    <pip>
boto                      2.49.0                    <pip>
boto3                     1.9.71                    <pip>
botocore                  1.12.71                   <pip>
bz2file                   0.98                      <pip>
ca-certificates           2018.03.07                    0
certifi                   2018.11.29               py37_0
cffi                      1.11.5                    <pip>
chardet                   3.0.4                     <pip>
colorama                  0.4.1                    py37_0
coverage                  4.5.2                     <pip>
cryptography              2.4.2                     <pip>
Cython                    0.29.1                    <pip>
distro                    1.3.0                     <pip>
docutils                  0.14                      <pip>
flake8                    3.6.0                     <pip>
gensim                    3.6.0                     <pip>
google-compute-engine     2.8.12                    <pip>
idna                      2.8                       <pip>
isort                     4.3.4                    py37_0
jmespath                  0.9.3                     <pip>
lazy-object-proxy         1.3.1            py37hfa6e2cd_2
mccabe                    0.6.1                    py37_1
more-itertools            4.3.0                     <pip>
numpy                     1.15.4                    <pip>
openssl                   1.1.1a               he774522_0
pip                       18.1                     py37_0
pluggy                    0.8.0                     <pip>
py                        1.7.0                     <pip>
pycodestyle               2.4.0                     <pip>
pycparser                 2.19                      <pip>
pyemd                     0.5.1                     <pip>
pyflakes                  2.0.0                     <pip>
pylint                    2.2.2                    py37_0
pyOpenSSL                 18.0.0                    <pip>
pytest                    4.0.1                     <pip>
python                    3.7.1                h8c8aaf0_6
python-dateutil           2.7.5                     <pip>
requests                  2.21.0                    <pip>
requests-mock             1.5.2                     <pip>
s3transfer                0.1.13                    <pip>
scipy                     1.2.0                     <pip>
setuptools                40.6.3                   py37_0
six                       1.12.0                    <pip>
six                       1.12.0                   py37_0
smart-open                1.7.1                     <pip>
sqlite                    3.26.0               he774522_0
urllib3                   1.24.1                    <pip>
vc                        14.1                 h0510ff6_4
vs2015_runtime            14.15.26706          h3a45250_0
wheel                     0.32.3                   py37_0
wincertstore              0.2                      py37_0
wrapt                     1.10.11          py37hfa6e2cd_2

Hi @ibrahimsharaf, try to uninstall numpy, scipy and gensim and install gensim again

Same problem, @menshikh-iv:

C:\Users\hatem\Anaconda3\envs\crashsim\lib\site-packages\gensim\utils.py:1212: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\hatem\Anaconda3\envs\crashsim\lib\site-packages\gensim\__init__.py", line 5, in <module>
    from gensim import parsing, corpora, matutils, interfaces, models, similarities, summarization, utils  # noqa:F401
  File "C:\Users\hatem\Anaconda3\envs\crashsim\lib\site-packages\gensim\corpora\__init__.py", line 6, in <module>
    from .indexedcorpus import IndexedCorpus  # noqa:F401 must appear before the other classes
  File "C:\Users\hatem\Anaconda3\envs\crashsim\lib\site-packages\gensim\corpora\indexedcorpus.py", line 15, in <module>
    from gensim import interfaces, utils
  File "C:\Users\hatem\Anaconda3\envs\crashsim\lib\site-packages\gensim\interfaces.py", line 21, in <module>
    from gensim import utils, matutils
  File "C:\Users\hatem\Anaconda3\envs\crashsim\lib\site-packages\gensim\matutils.py", line 1076, in <module>
    from gensim._matutils import logsumexp, mean_absolute_difference, dirichlet_expectation
  File "__init__.pxd", line 872, in init gensim._matutils
ValueError: numpy.ufunc has the wrong size, try recompiling. Expected 192, got 216

I have the exact same problem running Python 3.6 in a Docker container with Django + Gensim.

Gensim==3.6.0
Numpy==1.14.2

Steps to replicate

  1. install Gensim==3.6.0 and Numpy==1.14.2
  2. Import gensim.models

Stack Trace

Unhandled exception in thread started by <function check_errors.<locals>.wrapper at 0x7f9611305c80>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/django/utils/autoreload.py", line 225, in wrapper
    fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/django/core/management/commands/runserver.py", line 109, in inner_run
    autoreload.raise_last_exception()
  File "/usr/local/lib/python3.6/site-packages/django/utils/autoreload.py", line 248, in raise_last_exception
    raise _exception[1]
  File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 337, in execute
    autoreload.check_errors(django.setup)()
  File "/usr/local/lib/python3.6/site-packages/django/utils/autoreload.py", line 225, in wrapper
    fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/usr/local/lib/python3.6/site-packages/django/apps/registry.py", line 112, in populate
    app_config.import_models()
  File "/usr/local/lib/python3.6/site-packages/django/apps/config.py", line 198, in import_models
    self.models_module = import_module(models_module_name)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/code/wordvectors/models.py", line 9, in <module>
    from wordvectors.utilities.vector_factory import VectorFactory
  File "/code/wordvectors/utilities/vector_factory.py", line 4, in <module>
    from gensim.models import KeyedVectors
  File "/usr/local/lib/python3.6/site-packages/gensim/__init__.py", line 5, in <module>
    from gensim import parsing, corpora, matutils, interfaces, models, similarities, summarization, utils  # noqa:F401
  File "/usr/local/lib/python3.6/site-packages/gensim/models/__init__.py", line 7, in <module>
    from .coherencemodel import CoherenceModel  # noqa:F401
  File "/usr/local/lib/python3.6/site-packages/gensim/models/coherencemodel.py", line 36, in <module>
    from gensim.topic_coherence import (segmentation, probability_estimation,
  File "/usr/local/lib/python3.6/site-packages/gensim/topic_coherence/probability_estimation.py", line 12, in <module>
    from gensim.topic_coherence.text_analysis import (
  File "/usr/local/lib/python3.6/site-packages/gensim/topic_coherence/text_analysis.py", line 21, in <module>
    from gensim.models.word2vec import Word2Vec
  File "/usr/local/lib/python3.6/site-packages/gensim/models/word2vec.py", line 121, in <module>
    from gensim.models.keyedvectors import Vocab, Word2VecKeyedVectors
  File "/usr/local/lib/python3.6/site-packages/gensim/models/keyedvectors.py", line 162, in <module>
    from pyemd import emd
  File "/usr/local/lib/python3.6/site-packages/pyemd/__init__.py", line 75, in <module>
    from .emd import emd, emd_with_flow, emd_samples
  File "__init__.pxd", line 885, in init pyemd.emd
ValueError: numpy.ufunc has the wrong size, try recompiling. Expected 192, got 216

+1

@Vichoko sorry, it works for me (no exceptions). Can you try to uninstall pyemd and try again?
@ibrahimsharaf How exactly did you install gensim (pip or conda)? Can you enable verbose output and show the log of the installation process? Exact steps for reproducing the error would be really useful!

@menshikh-iv Minimum reproducible case on my end:

Dockerfile

FROM python:3
ADD app.py /
RUN pip install gensim
CMD [ "python", "./app.py" ]

app.py

import gensim

Build and run (this should produce the numpy error):

docker build -t gensim-import-error .
docker run gensim-import-error

I uninstalled numpy, scipy and gensim, then reinstalled gensim using conda install gensim instead of pip install gensim. Now everything works fine, but the newly installed gensim version is 3.4.0 instead of 3.6.0.

+1

We have a similar problem with our setup:

python 2.7,  ubuntu 18.04

numpy==1.14.2
scipy==1.2.0
gensim==3.6.0   # or previous versions 

Everything runs in a Docker setup, with pip requirements installed from a file.
Building the Docker image fails with the same error as above (complaining about numpy.ufunc).
However, if we run pip install gensim after the image is built, it works okay.

Could the issue be related to the pip cache?
The whole setup works once we add --no-cache-dir to the pip install options while building the Docker image.

@ibrahimsharaf @adibo @Gilgames000 @piercefreeman @acapello @Vichoko I see several problems here:

  1. Missing files in the sdist (already fixed, see https://github.com/RaRe-Technologies/gensim/pull/2194; the fix will be released at the end of January, together with a built package for conda)
  2. Different numpy versions are used for building and for installation. If I follow @piercefreeman's instructions and look at the logs (adding -vv to the pip command), I see something like
  gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/tmp/pip-install-_w1jwmfi/gensim/gensim/models -I/usr/local/include/python3.7m -I/tmp/pip-install-_w1jwmfi/gensim/.eggs/numpy-1.16.0rc2-py3.7-linux-x86_64.egg/numpy/core/include -c ./gensim/models/word2vec_inner.c -o build/temp.linux-x86_64-3.7/./gensim/models/word2vec_inner.o
  In file included from /tmp/pip-install-_w1jwmfi/gensim/.eggs/numpy-1.16.0rc2-py3.7-linux-x86_64.egg/numpy/core/include/numpy/ndarraytypes.h:1822:0,
                   from /tmp/pip-install-_w1jwmfi/gensim/.eggs/numpy-1.16.0rc2-py3.7-linux-x86_64.egg/numpy/core/include/numpy/ndarrayobject.h:12,
                   from /tmp/pip-install-_w1jwmfi/gensim/.eggs/numpy-1.16.0rc2-py3.7-linux-x86_64.egg/numpy/core/include/numpy/arrayobject.h:4,
                   from ./gensim/models/word2vec_inner.c:567:
  /tmp/pip-install-_w1jwmfi/gensim/.eggs/numpy-1.16.0rc2-py3.7-linux-x86_64.egg/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
   #warning "Using deprecated NumPy API, disable it with " \

i.e. the build uses numpy==1.16.0rc2, but the install log ends with

```
Successfully installed ... numpy-1.15.4
```

So the question is: why does this happen? If anybody has ideas, please let me know!
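
The mismatch described above can be checked mechanically. A rough sketch, assuming the verbose pip log has been saved as text and that it looks like the excerpt above (a `.eggs/numpy-<version>-py...` include path during compilation and a final `Successfully installed` line). The function names and regular expressions are illustrative assumptions, not part of pip or gensim:

```python
import re

# numpy version embedded in a build-time include path,
# e.g. ".eggs/numpy-1.16.0rc2-py3.7-linux-x86_64.egg/numpy/core/include"
BUILD_RE = re.compile(r"\.eggs/numpy-([0-9][0-9A-Za-z.]*?)-py")

# numpy version in pip's summary line,
# e.g. "Successfully installed ... numpy-1.15.4"
INSTALL_RE = re.compile(r"Successfully installed\b.*?\bnumpy-([0-9][0-9A-Za-z.]*)")


def numpy_versions_in_log(log_text):
    """Return (set of build-time numpy versions, installed numpy version or None)."""
    build_versions = set(BUILD_RE.findall(log_text))
    match = INSTALL_RE.search(log_text)
    installed = match.group(1) if match else None
    return build_versions, installed


def has_mismatch(log_text):
    """True if any numpy used for compiling differs from the one installed."""
    build_versions, installed = numpy_versions_in_log(log_text)
    return installed is not None and any(v != installed for v in build_versions)


if __name__ == "__main__":
    # Two lines shortened from the log quoted in this thread.
    sample = (
        "-I/tmp/pip-install-_w1jwmfi/gensim/.eggs/"
        "numpy-1.16.0rc2-py3.7-linux-x86_64.egg/numpy/core/include\n"
        "Successfully installed ... numpy-1.15.4\n"
    )
    print(numpy_versions_in_log(sample))
    print(has_mismatch(sample))
```

A log in which both versions agree, as in the workarounds, would make `has_mismatch` return False.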

Workarounds for (2)

  • Pass the --pre flag explicitly (pip install gensim --pre); numpy==1.16.0rc2 is then used for both build and install, so there are no errors.
  • Install numpy first and gensim afterwards, even without the --pre flag (pip install numpy && pip install gensim); numpy==1.15.4 is then installed and used for both build and install, so there are no errors.
  • Wait until numpy 1.16.0 is published as a full release, at which point pip stops using two different numpy versions.

Given that, I am closing this issue. If the workarounds don't help, feel free to re-open it and add new details.
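
The "numpy first" workaround can also be scripted. A minimal sketch, assuming pip is available as `python -m pip`. The helper names are hypothetical, and combining the workaround with the `--no-cache-dir` option reported earlier in this thread is an assumption of this sketch, not something the maintainers prescribe:

```python
import subprocess
import sys


def pip_install_cmd(*packages, pre=False, no_cache=False):
    """Build the argv for a pip install call (nothing is executed here)."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if pre:
        cmd.append("--pre")           # allow pre-releases such as 1.16.0rc2
    if no_cache:
        cmd.append("--no-cache-dir")  # skip pip's cache of built wheels
    cmd.extend(packages)
    return cmd


def numpy_first_workaround(no_cache=True):
    """Two separate installs, so gensim is built against the installed numpy."""
    return [
        pip_install_cmd("numpy", no_cache=no_cache),
        pip_install_cmd("gensim", no_cache=no_cache),
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.check_call(cmd)


if __name__ == "__main__":
    for cmd in numpy_first_workaround():
        print(" ".join(cmd))
    # run_all(numpy_first_workaround())  # uncomment to actually install
```

The order of the two installs is the point of the workaround: a single `pip install numpy gensim` does not guarantee that numpy is present before gensim is compiled.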

I tried both workarounds and got the same errors.
In the Python environment with the errors, I uninstalled gensim and numpy with pip uninstall.
Then I tried the workarounds and got the same error:

ValueError: numpy.ufunc has the wrong size, try recompiling. Expected 192, got 216

I tried this in a Docker image running on Ubuntu.

Edit: After some time playing with the Dockerfile, I finally made it work.
The fix was to install numpy before gensim, as the workaround stated.
The difference from my previous approach was that I fully deleted the Docker image instead of using pip uninstall to remove numpy and gensim and reinstalling them later.
