I have successfully built and installed spaCy 0.100.7 with OpenMP on OS X for brew installed python. For the impatient among us, here are instructions on how to do it yourself.
Choose your adventure:
brew install python
llvm-3.8
clang-omp
setup.py, then build and test spacy
Note that these instructions are for brew installed python.
Install the binaries:
For example, to install at /opt/llvm38:
tar xJf clang+llvm-3.8.0-x86_64-apple-darwin.tar.xz
sudo mkdir -p /opt
sudo mv clang+llvm-3.8.0-x86_64-apple-darwin /opt/llvm38
Tell pip to use clang-3.8:
export CC=/opt/llvm38/bin/clang
export CXX=/opt/llvm38/bin/clang++
export PATH=/opt/llvm38/bin:$PATH
export C_INCLUDE_PATH=/opt/llvm38/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=/opt/llvm38/include:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=/opt/llvm38/lib:$LIBRARY_PATH
export DYLD_LIBRARY_PATH=/opt/llvm38/lib:$DYLD_LIBRARY_PATH
Install clang-omp Homebrew:
brew install clang-omp
Tell pip to use clang-omp:
export CC=clang-omp
export CXX=clang-omp
export PATH=/usr/local/bin:$PATH
export C_INCLUDE_PATH=/usr/local/include/libiomp:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=/usr/local/include/libiomp:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=/usr/local/lib:$LIBRARY_PATH
export DYLD_LIBRARY_PATH=/usr/local/lib:$DYLD_LIBRARY_PATH
Install GCC via Homebrew:
brew install gcc --without-multilib
The --without-multilib option is required for OpenMP support.
Tell pip to use GCC:
export CC=gcc-5
export CXX=g++-5
WARNING: Compiling with GCC as of spaCy 0.100.5 may result in a segfault (#266).
setup.py and install
Follow the 'Compile from source' instructions from the spaCy documentation, with the following adjustments.
git clone https://github.com/honnibal/spaCy.git
cd spaCy
git checkout 0.100.6 # or 'master' if you wish
Edit setup.py lines 88-90 to enable OpenMP:
# if not sys.platform.startswith('darwin'):
compile_options['other'].append('-fopenmp')
link_options['other'].append('-fopenmp')
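To make the effect of that edit concrete, here is a minimal sketch (not the actual setup.py) of how a platform guard like the one on those lines decides whether -fopenmp is passed. The function name openmp_flags and the respect_darwin_guard parameter are made up for illustration; only the 'other' key and the -fopenmp flag come from the snippet above.

```python
def openmp_flags(platform, respect_darwin_guard=True):
    """Return (compile_options, link_options) for a given sys.platform value.

    Hypothetical helper: it only mimics the shape of the guard shown above.
    """
    compile_options = {'other': []}
    link_options = {'other': []}
    # The guard that the edit comments out: skip OpenMP on OS X ('darwin').
    skip = respect_darwin_guard and platform.startswith('darwin')
    if not skip:
        compile_options['other'].append('-fopenmp')
        link_options['other'].append('-fopenmp')
    return compile_options, link_options
```

With the guard in place, openmp_flags('darwin') yields empty option lists; with the guard disabled, which is what commenting out the if line amounts to, both lists contain -fopenmp.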
Now continue with the install instructions as per the documentation.
virtualenv .env && source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py clean
pip install -e .
python -m spacy.en.download
pip install pytest
py.test spacy/tests/
To install spaCy outside of virtualenv and/or outside the source directory:
Leave the virtualenv using the deactivate command.
Run pip install . in the source directory.
export DYLD_LIBRARY_PATH=/usr/local/lib:$DYLD_LIBRARY_PATH  # if you built with clang-omp
export DYLD_LIBRARY_PATH=/opt/llvm38/lib:$DYLD_LIBRARY_PATH  # if you built with llvm-3.8
Relevant resources:
Thank you for the help @honnibal @henningpeters @gushecht !
History:
deactivate virtualenv
pip install -e . instead of python setup.py build_ext --inplace
Thanks for the pointers and making it work.
I am not a big fan of these silent up/downgrades depending on what dependencies you had installed. PIL comes to mind as a particularly bad example of this style. I would prefer passing some args/envs to setup.py and failing if dependencies are not available, or shipping a separate package. We'll definitely look into making this more accessible.
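As a rough sketch only, the "pass an env and fail" idea could look something like the following. The SPACY_OPENMP variable name and the resolve_openmp helper are hypothetical; nothing in this thread says setup.py supports them.

```python
import os


def resolve_openmp(env=None, compiler_ok=False):
    """Decide whether to build with OpenMP, failing loudly instead of
    silently downgrading.

    env: mapping of environment variables (defaults to os.environ).
    compiler_ok: whether the caller has verified the compiler accepts -fopenmp.
    """
    env = os.environ if env is None else env
    # Hypothetical opt-in variable; the name is an assumption for this sketch.
    requested = env.get('SPACY_OPENMP', '0') == '1'
    if requested and not compiler_ok:
        raise RuntimeError(
            'OpenMP was requested but the compiler does not support -fopenmp')
    return requested
```

The point of the design is that an explicit request is never quietly ignored: either the build uses OpenMP or it stops with an error.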
LLVM has had native OpenMP support since version 3.7, so the best solution may just be to wait until Apple releases the next version of Xcode this summer (or thereabouts).
I can confirm that clang-omp works. We had another thread some time ago about compilation problems with gcc on OSX (#237). It's unfortunate, but there is very little support for detecting compilers with setuptools, and even if you could do that reliably you might still be out of luck if the user's Python was built with non-gcc flags (AFAIK they are not visible from within setup.py). What I want to say is: supporting gcc on OSX is really tough, and not that much appreciated even when it does work.
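For what it's worth, one common workaround for the detection problem is to simply try compiling a tiny OpenMP program with the chosen compiler. The sketch below assumes that approach; compiler_supports_openmp is a made-up helper, and it says nothing about the flags the user's Python itself was built with.

```python
import os
import subprocess
import tempfile

# Smallest program that needs both the OpenMP header and runtime to link.
OPENMP_TEST_SOURCE = """
#include <omp.h>
int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }
"""


def compiler_supports_openmp(cc):
    """Return True if `cc` can compile and link a tiny OpenMP program."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, 'omp_test.c')
        out = os.path.join(tmp, 'omp_test')
        with open(src, 'w') as f:
            f.write(OPENMP_TEST_SOURCE)
        try:
            result = subprocess.run(
                [cc, '-fopenmp', src, '-o', out],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # The compiler executable was not found or could not be run.
            return False
        return result.returncode == 0
```

A caller could pass os.environ.get('CC', 'cc') to probe whichever compiler the build would pick up.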
Great news on clang-omp. I've been experimenting with many native extensions for Python, R, and NodeJS. I've had so many issues with native extensions and compilers that I now just install whichever compiler works best... Once Apple updates Xcode with LLVM 3.7, there will be much less need for supporting GCC on OS X.
Practically speaking, the clang-omp solution works for me and I'm okay leaving it at that for now. I'm sure you guys have plenty of other interesting improvements to make to spaCy!
Hello again @mikepb
I got to python setup.py build_ext --inplace and then it failed. Screenshot attached. Many thanks, as always.

Also, for what it's worth, I found that I had to downgrade to Python 2.7.9 from 2.7.11
I see you are using the Anaconda Python. I wrote the instructions for the homebrew Python. What is the compiler used to build Anaconda Python?
Just running the interpreter on my machine printed this message:
Python 2.7.11 (default, Feb 18 2016, 14:32:04)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
If you are using GCC-built Python, try using GCC to build spacy instead:
brew install gcc --without-multilib
export CC=gcc-5
export CXX=g++-5
I've been told that the gcc --without-multilib option is required for the OpenMP support used by spacy.
If you are successful, please let us know!
Remember to reset your build environment as well. Running python setup.py clean then using a new terminal session should do the trick.
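The compiler banner the interpreter prints at startup can also be read programmatically with platform.python_compiler(). Below is a small sketch of classifying it; classify_compiler is a hypothetical helper, and the matching rules are a guess based on banners like the one quoted above, where Apple's clang also mentions GCC.

```python
import platform


def classify_compiler(banner):
    """Guess the compiler family from a banner such as the one returned by
    platform.python_compiler()."""
    text = banner.lower()
    # Apple's clang reports "GCC 4.2.1 Compatible", so check clang/LLVM first.
    if 'clang' in text or 'llvm' in text:
        return 'clang'
    if 'gcc' in text:
        return 'gcc'
    return 'unknown'
```

Calling classify_compiler(platform.python_compiler()) inside the Python you intend to build against gives a quick hint about which compiler family to match.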
Update: tried with GCC and successfully got the segfault :)
So then I tried with homebrew Python. Please note that I was getting an issue trying to clone the repo after running the export statements, so I revised the order a little:
brew install python
brew install clang-omp
git clone https://github.com/honnibal/spaCy.git
cd spaCy
git checkout 0.100.5
# Made the changes you specified to setup.py
export CC=clang-omp
export CXX=clang-omp
export PATH=/usr/local/bin:$PATH
export C_INCLUDE_PATH=/usr/local/include/libiomp:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=/usr/local/include/libiomp:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=/usr/local/lib:$LIBRARY_PATH
export DYLD_LIBRARY_PATH=/usr/local/lib:$DYLD_LIBRARY_PATH
virtualenv .env && source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py clean
python setup.py build_ext --inplace
And then....when downloading the English pack
(.env) Guss-MacBook-Pro:spaCy gushecht$ python -m spacy.en.download
/Users/gushecht/spaCy/.env/bin/python: dlopen(spacy/tokenizer.so, 2): Symbol not found: __ZTINSt8ios_base7failureE
Referenced from: spacy/tokenizer.so
Expected in: dynamic lookup
Any thoughts?
Perhaps at this point it makes more sense for me to spin up a Linux box on AWS and bypass the issue.
I actually downloaded the English pack using pip-installed spacy and symlinked the directory. I'm on a slow connection and neglected to test that command. Symlinking the data dir might work.
I've used the spot instances for cheap, but expect your instance to be killed during peak hours.
As an alternative to python -m spacy.en.download you can also run python -m sputnik --name spacy install en. But a symbol not found error looks like your installation didn't finish properly or is in a half-baked state. Can you run python setup.py clean and install again?
Btw: python setup.py build_ext --inplace is outdated. Please run pip install -e . for a development install.
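A quick way to tell whether a rebuild actually fixed the half-baked state is to try importing the compiled modules directly. The sketch below uses a made-up check_extension helper; the module name spacy.tokenizer is taken from the error message earlier in the thread. For reference, the missing symbol in that message demangles to the typeinfo for std::ios_base::failure, which belongs to the C++ standard library.

```python
import importlib


def check_extension(module_name):
    """Try to import a module and return (ok, message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # A compiled extension that fails to load (for example with a
        # "Symbol not found" error) surfaces as ImportError.
        return False, str(exc)
    return True, 'ok'
```

For example, check_extension('spacy.tokenizer') run from the activated virtualenv reports the loader's message without needing to start the model download.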
Hey all, looks like pip install -e . was the magic formula!
For final reference:
brew install python
brew install clang-omp
git clone https://github.com/honnibal/spaCy.git
cd spaCy
git checkout 0.100.5
# Make the specified changes to setup.py
export CC=clang-omp
export CXX=clang-omp
export PATH=/usr/local/bin:$PATH
export C_INCLUDE_PATH=/usr/local/include/libiomp:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=/usr/local/include/libiomp:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=/usr/local/lib:$LIBRARY_PATH
export DYLD_LIBRARY_PATH=/usr/local/lib:$DYLD_LIBRARY_PATH
virtualenv .env && source .env/bin/activate
export PYTHONPATH=`pwd`
pip install -r requirements.txt
python setup.py clean
pip install -e .
Thanks so much to each of you.
Thanks for the update :+1: I've updated the issue description to use pip install -e .
Updated instructions with LLVM-3.8 option for native OpenMP support.
@henningpeters I managed to build the latest spacy master with LLVM-3.8 and OpenMP support as a binary wheel. The hack-patch for setup.py and the corresponding binary wheel is attached. If you have the time, please let me know if the wheel works!
@mikepb : Took forever to circle back to this. Much appreciated.
I've integrated your patch into the setup.py here: https://github.com/explosion/spaCy/commit/36bcd46244f4167eca32b93e4fd43eccab6844bb
I'm running a bit blind on this, so hopefully I haven't done anything wrong. We're not fully supporting wheels at the moment, due to resource constraints.
@gushecht: Thanks a lot for your snippet. I've referenced this thread in the install docs for MacOS / OSX.
I'm closing this for now to signal that I'm not aware of further action here. Not 100% confident this is fully resolved yet, as I don't feel very across the issue. Please reopen if there's more to do.
As of today, clang-omp is deprecated and has moved to homebrew/boneyard/clang-omp. You should use brew install llvm instead.
Environment variables I used before pip installation:
export CC=/usr/local/opt/llvm/bin/clang
export CXX=/usr/local/opt/llvm/bin/clang++
export PATH=/usr/local/opt/llvm/bin:$PATH
export C_INCLUDE_PATH=/usr/local/opt/llvm/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=/usr/local/opt/llvm/include:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=/usr/local/opt/llvm/lib:$LIBRARY_PATH
export DYLD_LIBRARY_PATH=/usr/local/opt/llvm/lib:$DYLD_LIBRARY_PATH
Got about 2X speed up after this.
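If you would rather not export these variables in your shell, the same environment can be assembled in Python and handed to a single pip invocation. This is only a sketch: llvm_build_env is a hypothetical helper, and the default prefix is the Homebrew path used in the exports above.

```python
import os


def llvm_build_env(prefix='/usr/local/opt/llvm', base=None):
    """Return a copy of the environment with the LLVM toolchain prepended."""
    base = dict(os.environ if base is None else base)
    env = dict(base)
    env['CC'] = prefix + '/bin/clang'
    env['CXX'] = prefix + '/bin/clang++'
    search_paths = [
        ('PATH', 'bin'),
        ('C_INCLUDE_PATH', 'include'),
        ('CPLUS_INCLUDE_PATH', 'include'),
        ('LIBRARY_PATH', 'lib'),
        ('DYLD_LIBRARY_PATH', 'lib'),
    ]
    for name, subdir in search_paths:
        existing = base.get(name, '')
        # Unlike the plain shell exports, this avoids a dangling ':' when
        # the variable was previously unset.
        env[name] = prefix + '/' + subdir + (':' + existing if existing else '')
    return env
```

It could then be used as subprocess.check_call(['pip', 'install', '-e', '.'], env=llvm_build_env()) from the source directory, which keeps the compiler override scoped to that one build.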
The patch seems to be "wrong" (maybe I'm missing something, therefore the quotes).
The PACKAGES.append line makes the installation fail with this error:
Obtaining file:///private/tmp/spaCy
Complete output from command python setup.py egg_info:
warning: ner.pyx:131:29: Not all members given for struct 'Transition'
warning: ner.pyx:131:29: Not all members given for struct 'Transition'
Processing attrs.pyx
Processing cfile.pyx
Processing gold.pyx
Processing lexeme.pyx
Processing matcher.pyx
Processing morphology.pyx
Processing orth.pyx
Processing parts_of_speech.pyx
Processing pipeline.pyx
Processing strings.pyx
Processing symbols.pyx
Processing tagger.pyx
Processing tokenizer.pyx
Processing typedefs.pyx
Processing vocab.pyx
Processing bits.pyx
Processing huffman.pyx
Processing packer.pyx
Processing _parse_features.pyx
Processing _state.pyx
Processing arc_eager.pyx
Processing beam_parser.pyx
Processing iterators.pyx
Processing ner.pyx
Processing nonproj.pyx
Processing parser.pyx
Processing stateclass.pyx
Processing transition_system.pyx
Processing doc.pyx
Processing span.pyx
Processing token.pyx
Cythonizing sources
running egg_info
writing spacy.egg-info/PKG-INFO
writing dependency_links to spacy.egg-info/dependency_links.txt
writing requirements to spacy.egg-info/requires.txt
writing top-level names to spacy.egg-info/top_level.txt
error: package directory 'spacy/platform/darwin/lib' does not exist
Removing the line, instead, enables building and installing the multithreaded version of spaCy. I still hit an issue when specifying -1, or a number larger than the number of cores on my system, as the number of threads, but I can live with that. To me it is far more important to ensure the multithreaded version is stable (as in "it builds like it should").
I saw that directory in the egg from @mikepb, but that's about it.
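Until the thread-count issue is looked at, a caller-side workaround would be to clamp the requested number of threads before passing it on. The helper below is hypothetical and not part of spaCy; treating non-positive values such as -1 as "use every core" is an assumption.

```python
import os


def clamp_threads(requested, available=None):
    """Clamp a requested thread count to the number of available cores."""
    if available is None:
        available = os.cpu_count() or 1
    if requested < 1:
        # Assumption: non-positive values such as -1 mean "use every core".
        return available
    return min(requested, available)
```

On a four-core machine this maps both -1 and 16 to 4, and leaves 2 unchanged.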