I have a question regarding GOMP, which I figure is related to gcc.
I am using this image as a base for some experiments. The gcc version in the unmodified image (4.9.2) looks reasonable, but I get some strange behaviour with a number of Cython-related packages.
Example 1
After running pip install lightfm on the unmodified jupyter/all-spark-notebook docker image, importing the package in Python gives the following error:
import lightfm
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/conda/lib/python3.4/site-packages/lightfm/__init__.py", line 1, in <module>
from .lightfm import LightFM
File "/opt/conda/lib/python3.4/site-packages/lightfm/lightfm.py", line 7, in <module>
from .lightfm_fast import (CSRMatrix, FastLightFM,
ImportError: /opt/conda/lib/python3.4/site-packages/lightfm/lightfm_fast.cpython-34m.so: undefined symbol: GOMP_parallel
Things I have tried: changing
from .lightfm_fast import (CSRMatrix, FastLightFM,
fit_logistic, predict_lightfm,
fit_warp, fit_bpr, fit_warp_kos)
to
from .lightfm_fast import (CSRMatrix, FastLightFM, fit_logistic, predict_lightfm, fit_warp, fit_bpr, fit_warp_kos).
Same error.
I also tried changing ".lightfm" to "lightfm" to switch away from the relative import. Same error.
Checking gcc and kernel versions: gcc 4.9.2 Ubuntu 14.04 Linux 00846c176840 3.13.0-67-generic #110-Ubuntu SMP Fri Oct 23 13:24:41 UTC 2015 x86_64 GNU/Linux
But I think if you just pull the docker image and do a pip install lightfm it should replicate the error precisely.
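One way to narrow this down is to ask the dynamic loader directly whether a given copy of libgomp exports the missing symbol. This is a minimal sketch, assuming Linux and the standard ctypes module. The helper name exports_symbol is made up for illustration, and the path in the usage note is only a guess based on the conda prefix in the traceback above.

```python
import ctypes


def exports_symbol(lib_path, symbol):
    """Return True if the library at lib_path loads and exports symbol.

    lib_path may be None, which asks the loader about the symbols already
    visible to the running process (POSIX only).
    """
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        # The file is missing or could not be loaded at all.
        return False
    # ctypes raises AttributeError for unknown symbols, so hasattr suffices.
    return hasattr(lib, symbol)
```

For example, exports_symbol("/opt/conda/lib/libgomp.so.1", "GOMP_parallel") returning False while the same call on the system copy returns True would suggest the conda copy is the older one.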
Example 2
Running pip install xgboost succeeds, but when I then try to import it in a notebook I get:
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-2-afdaff4619ce> in <module>()
----> 1 import xgboost
/home/jovyan/.local/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/__init__.py in <module>()
9 import os
10
---> 11 from .core import DMatrix, Booster
12 from .training import train, cv
13 from . import rabit
/home/jovyan/.local/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/core.py in <module>()
81
82 # load the XGBoost library globally
---> 83 _LIB = _load_lib()
84
85 def _check_call(ret):
/home/jovyan/.local/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/core.py in _load_lib()
75 if len(lib_path) == 0:
76 return None
---> 77 lib = ctypes.cdll.LoadLibrary(lib_path[0])
78 lib.XGBGetLastError.restype = ctypes.c_char_p
79 return lib
/opt/conda/lib/python3.5/ctypes/__init__.py in LoadLibrary(self, name)
423
424 def LoadLibrary(self, name):
--> 425 return self._dlltype(name)
426
427 cdll = LibraryLoader(CDLL)
/opt/conda/lib/python3.5/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
345
346 if handle is None:
--> 347 self._handle = _dlopen(self._name, mode)
348 else:
349 self._handle = handle
OSError: /opt/conda/bin/../lib/libgomp.so.1: version `GOMP_4.0' not found (required by /home/jovyan/.local/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/libxgboost.so)
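The message above says that the libgomp.so.1 which got loaded does not define the GOMP_4.0 version tag. A rough way to see which tags a particular copy carries is to scan its bytes for them, much like running strings on the file. This is only a sketch: the function names are hypothetical, and the scan is a heuristic, not a real ELF parser.

```python
import re

# Version tags look like GOMP_1.0 or GOMP_4.0.1; symbol names such as
# GOMP_parallel have no digit after the underscore and are not matched.
_GOMP_TAG = re.compile(rb"GOMP_\d+(?:\.\d+)*")


def gomp_tags_in_bytes(data):
    """Return the sorted, de-duplicated GOMP_x.y tags found in data."""
    return sorted({tag.decode("ascii") for tag in _GOMP_TAG.findall(data)})


def gomp_tags_in_file(lib_path):
    """Read a shared library from disk and list the GOMP_x.y tags in it."""
    with open(lib_path, "rb") as handle:
        return gomp_tags_in_bytes(handle.read())
```

Calling gomp_tags_in_file("/opt/conda/lib/libgomp.so.1") (the path from the error) and comparing the result with the system copy should show whether GOMP_4.0 is present in one and absent in the other.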
I have noticed that this is a recurring pattern and at times it can be quite limiting, but I don't understand compilers well enough to know whether this is actually a problem with the image or a deliberate design decision. Any ideas?
Much appreciated!
Yeah, gomp stands for GNU OpenMP and as you have guessed is a gcc thing. Do you have either libgcc or gcc installed in your environment?
To check, go to Jupyter Notebook's Home and instead of creating a new notebook select New -> Terminal. From there you can run conda list. Please let me know the results.
Thanks!
I have this line
libgcc 4.8.5 1 r
But no gcc. I also have glib.
But in terminal:
jovyan@aaaaaaa:~/work$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
The complete list:
cairo 1.12.18 6 defaults
conda 4.0.5 py35_0 defaults
conda-env 2.4.5 py35_0 defaults
curl 7.45.0 0 defaults
cycler 0.10.0 py35_0 defaults
decorator 4.0.9 py35_0 defaults
fontconfig 2.11.1 5 defaults
freetype 2.5.5 0 defaults
glib 2.43.0 2 r
harfbuzz 0.9.39 0 defaults
ipykernel 4.3.1 py35_0 defaults
ipython 4.1.2 py35_0 defaults
ipython-genutils 0.1.0 <pip>
ipython_genutils 0.1.0 py35_0 defaults
ipywidgets 4.1.1 py35_0 defaults
jbig 2.1 0 defaults
jinja2 2.8 py35_0 defaults
jpeg 8d 0 defaults
jsonschema 2.4.0 py35_0 defaults
jupyter 1.0.0 py35_1 defaults
jupyter-client 4.2.2 <pip>
jupyter-console 4.1.1 <pip>
jupyter-core 4.1.0 <pip>
jupyter_client 4.2.2 py35_0 defaults
jupyter_console 4.1.1 py35_0 defaults
jupyter_core 4.1.0 py35_0 defaults
libffi 3.0.13 0 defaults
libgcc 4.8.5 1 r
libgfortran 3.0 0 defaults
libpng 1.6.17 0 defaults
libsodium 1.0.3 0 defaults
libtiff 4.0.6 1 defaults
libxml2 2.9.2 0 defaults
markupsafe 0.23 py35_0 defaults
matplotlib 1.5.1 np110py35_0 defaults
mistune 0.7.2 py35_0 defaults
mkl 11.3.1 0 defaults
nbconvert 4.1.0 py35_0 defaults
nbformat 4.0.1 py35_0 defaults
ncurses 5.9 4 r
notebook 4.1.0 py35_1 defaults
numpy 1.10.4 py35_1 defaults
openssl 1.0.2g 0 defaults
pandas 0.17.1 np110py35_0 defaults
pango 1.39.0 0 defaults
path.py 8.1.2 py35_1 defaults
pcre 8.31 0 defaults
pexpect 3.3 py35_0 defaults
pickleshare 0.5 py35_0 defaults
pip 8.1.1 py35_0 defaults
pixman 0.32.6 0 defaults
ptyprocess 0.5 py35_0 defaults
pycosat 0.6.1 py35_0 defaults
pycrypto 2.6.1 py35_0 defaults
pygments 2.1.1 py35_0 defaults
pyparsing 2.0.3 py35_0 defaults
pyqt 4.11.4 py35_1 defaults
python 3.5.1 0 defaults
python-dateutil 2.5.0 py35_0 defaults
pytz 2016.1 py35_0 defaults
pyyaml 3.11 py35_1 defaults
pyzmq 15.2.0 py35_0 defaults
qt 4.8.7 1 defaults
qtconsole 4.1.1 py35_1 defaults
r 3.2.2 0 r
r-base 3.2.2 0 r
r-base64enc 0.1_3 r3.2.2_0a r
r-bitops 1.0_6 r3.2.2_1a r
r-boot 1.3_17 r3.2.2_0a r
r-class 7.3_14 r3.2.2_0a r
r-cluster 2.0.3 r3.2.2_0a r
r-codetools 0.2_14 r3.2.2_0a r
r-colorspace 1.2_6 r3.2.2_0a r
r-dichromat 2.0_0 r3.2.2_2a r
r-digest 0.6.8 r3.2.2_2a r
r-evaluate 0.8 r3.2.2_0a r
r-foreign 0.8_66 r3.2.2_0a r
r-ggplot2 1.0.1 r3.2.2_0a r
r-gtable 0.1.2 r3.2.2_2a r
r-irdisplay 0.3 r3.2.2_0a r
r-irkernel 0.5 r3.2.2_1a r
r-jsonlite 0.9.17 r3.2.2_0a r
r-kernsmooth 2.23_15 r3.2.2_0a r
r-labeling 0.3 r3.2.2_1a r
r-lattice 0.20_33 r3.2.2_0a r
r-magrittr 1.5 r3.2.2_1a r
r-mass 7.3_45 r3.2.2_0a r
r-matrix 1.2_2 r3.2.2_0a r
r-mgcv 1.8_9 r3.2.2_0a r
r-munsell 0.4.2 r3.2.2_1a r
r-nlme 3.1_122 r3.2.2_0a r
r-nnet 7.3_11 r3.2.2_0a r
r-plyr 1.8.3 r3.2.2_0a r
r-proto 0.3_10 r3.2.2_1a r
r-rcolorbrewer 1.1_2 r3.2.2_2a r
r-rcpp 0.12.2 r3.2.2_0a r
r-rcurl 1.95_4.7 r3.2.2_0a r
r-recommended 3.2.2 r3.2.2_0 r
r-repr 0.3 r3.2.2_0a r
r-reshape2 1.4.1 r3.2.2_1a r
r-rpart 4.1_10 r3.2.2_0a r
r-rzmq 0.7.7 r3.2.2_3a r
r-scales 0.3.0 r3.2.2_0a r
r-spatial 7.3_11 r3.2.2_0a r
r-stringi 1.0_1 r3.2.2_0a r
r-stringr 1.0.0 r3.2.2_0a r
In light of your suggestion, I tried
conda install -c https://conda.anaconda.org/anaconda gcc in the terminal
but to no avail.
Also when I tried to do install.packages('xgboost') (equivalent) in R, I got this error:
Error : .onLoad failed in loadNamespace() for 'xgboost', details:
call: dyn.load(file, DLLpath = DLLpath, ...)
error: unable to load shared object '/usr/local/spark-1.6.0-bin-hadoop2.6/R/lib/xgboost/libs/xgboost.so':
/usr/local/spark-1.6.0-bin-hadoop2.6/R/lib/xgboost/libs/xgboost.so: undefined symbol: GOMP_parallel
sudo apt-get install libgomp1 also did not help me as it was already installed.
It looks like Python is looking for gomp in /opt/conda/lib and finding an incompatible version, and R is doing the same in the spark-1.6.0 dir and finding one that is also missing something.
Not sure how to force both to use the system lib path (or if that would help). Maybe setting LD_LIBRARY_PATH before notebook start to /usr/lib/gcc/x86_64-linux-gnu/4.9?
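If someone wants to try the LD_LIBRARY_PATH idea without touching the image, one option is to build a modified environment and launch the notebook server as a child process with it. This is a sketch under assumptions: the helper name is made up, and whether the override actually helps is untested here, since a search path baked into the extension itself may still take precedence.

```python
import os


def env_with_library_path(extra_dir, base_env=None):
    """Return a copy of the environment with extra_dir first on LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Prepend so the loader consults extra_dir before any existing entries.
    env["LD_LIBRARY_PATH"] = extra_dir + (os.pathsep + current if current else "")
    return env
```

The result can be passed as the env argument of subprocess.call, for example subprocess.call(["jupyter", "notebook"], env=env_with_library_path("/usr/lib/gcc/x86_64-linux-gnu/4.9")). The original environment is left unchanged.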
It is weird because the libgcc package is present. This should have libgomp. What version of gcc is available (if installed) in our images? Is it newer than 4.8.5?
The apt-get installed copy is 4.9. But there are also copies in conda (not sure what version) and SparkR (again, not sure what version) that seem to be getting picked up ahead of the Ubuntu version.
That is likely part of the problem. How was xgboost built, @jakubLangr?
So, I think one thing that might help is if we can get xgboost packaged in some standard way.
This is hardly the first time I have seen someone struggle with oddities related to this package, and it's not clear to me if it is even being built the same way each time. It would be nice if we could get everyone to use one canonical package for xgboost.
Would you, @jakubLangr, be willing to try submitting it to conda-forge? This should ideally cut down on issues of this nature and it would make it easy to install. All the builds happen in very minimal VMs so there is little risk of contamination. You could continue to tweak the recipe after submission to your liking. Not to mention, the build process is completely automated so there is no need to spend cycles on a local machine doing this. I would be more than happy to help you through the process. Once we have an acceptable recipe, builds and releases would occur immediately. Please let me know if this is something you would be interested in.
@jakirkham I hit the original problem simply by doing pip install xgboost. It seems to look in the lib path in the same prefix as the Python package install location, which in the stacks case is the /opt/conda directory. So it finds the incompatible libgomp from conda.
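To confirm which copy of libgomp the loader actually picks inside a given Python process, one can load it by soname and then look at the process's memory map. This is a Linux-only sketch with hypothetical helper names; it reports what is mapped, not why the loader chose it.

```python
def mapped_libraries(maps_text, needle):
    """Return distinct file paths from /proc/<pid>/maps text containing needle."""
    paths = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional path.
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5] and fields[5] not in paths:
            paths.append(fields[5])
    return paths


def loaded_in_this_process(needle):
    """List files matching needle that the current process has mapped."""
    with open("/proc/self/maps") as handle:
        return mapped_libraries(handle.read(), needle)
```

A possible use is to run ctypes.CDLL("libgomp.so.1") in the notebook kernel and then call loaded_in_this_process("libgomp"). A path under /opt/conda/lib would match the explanation above.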
Thanks for investigating, @parente. I expect that if one installed Continuum's gcc package into the conda environment before running pip install xgboost, this problem would go away.
That said, conda is designed to avoid these annoyances with pip and compilers by doing the build step in a clean environment, separate from the install. It really seems the OP could benefit from a simple recipe here. Given that pip install works, this recipe should be quite manageable and would already be very useful.
Indeed @jakirkham is right:
%%bash
conda install -y gcc
pip install xgboost
import xgboost
I put this workaround on the recipes page for the time being: https://github.com/jupyter/docker-stacks/wiki/Docker-recipes#use-xgboost
Let's leave this issue open for a bit so @jakubLangr can respond about the conda forge recipe and/or this workaround.
Hey @parente 👍 that worked for xgboost (thanks a lot!), but unfortunately not for lightfm. (Also really sorry for the late reply!)
I understand that this is not such a massive deal for you guys, as it is not an extremely popular plug-in and I can probably find a workaround, but it would probably be good to record this for posterity.
You can still replicate this by starting a new all-pyspark docker image and then sshing into the machine (as root):
conda install -y gcc
pip install lightfm
python
Python 3.4.4 |Continuum Analytics, Inc.| (default, Jan 11 2016, 13:54:01)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import lightfm
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/conda/lib/python3.4/site-packages/lightfm/__init__.py", line 1, in <module>
from .lightfm import LightFM
File "/opt/conda/lib/python3.4/site-packages/lightfm/lightfm.py", line 7, in <module>
from .lightfm_fast import (CSRMatrix, FastLightFM,
ImportError: /opt/conda/lib/python3.4/site-packages/lightfm/lightfm_fast.cpython-34m.so: undefined symbol: GOMP_parallel
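For recording results like this across several packages, a small helper that attempts each import and captures the loader message can be handy. This is a minimal sketch; the function name try_import is hypothetical.

```python
import importlib


def try_import(module_name):
    """Import module_name and return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
    except (ImportError, OSError) as error:
        # Both failure modes seen in this thread land here: an undefined
        # symbol surfaces as ImportError, a failed ctypes load as OSError.
        return False, "{}: {}".format(type(error).__name__, error)
    return True, ""
```

Running it over ["lightfm", "xgboost"] after each attempted fix gives a compact record of which packages load and what the loader complained about.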
I understand that ideally lightfm would be conda installable, but unfortunately this is not available on the anaconda website.
Any thoughts on this?
Thanks!
Again @jakubLangr, I think the right path forward is to write conda recipes for these difficult to pip install packages and contribute them to conda-forge. Our express mission over there is to help people package things that Anaconda doesn't provide. We also hope to stay more current because we have a large community that is actively engaged and infrastructure to support that operation.
That being said, one needs to take ownership of recipes they contribute. We can help you get started, but ultimately it will be on you to maintain the things you add. Though we do provide you the infrastructure to build and deploy on any platform you choose to support. That way you will be able to run conda install -c conda-forge <xyz package>.
As awesome as the images over here are, they are not targeted to be a build environment. They are targeted at installing a basic stack. Anything else you want you can resolve on your own by using conda for binaries, pip for other things (ideally pure Python), and starting up a notebook for one to interact with all this neat stuff. We (at conda-forge) have a docker image that is targeted to be a build environment that we use for that purpose. It is designed for good compatibility so things built there will work here and most other Linux environments too (unless they are particularly ancient).
To get the ball rolling, I would suggest starting by reading the aforementioned docs and maybe writing a recipe for something simple like a Python package and submitting that to the staged-recipes repo as described. Once you have more of a feel for the system, we can try repackaging this recipe for xgboost. There seem to be a fair number of people that are interested in and using that recipe so we can try to ask them for help too. We can then figure out where to go from there.
I'm going to close this issue since I agree with the solution recommended by @jakirkham. An xgboost conda-forge recipe would make installation much easier.
Just as a follow-up on this, we are adding xgboost to conda-forge. So the best solution is probably just to install it from there if you need it.