I downloaded the tarball for xgboost-0.80 and followed the instructions for building for R with GPU support: https://xgboost.readthedocs.io/en/latest/build.html#installing-r-package-with-gpu-support
The `cmake` step fails with:

```
-- Setting build type to 'Release' as none was specified.
CMake Error at CMakeLists.txt:67 (add_subdirectory):
  The source directory

    [PATH]/xgboost-0.80/dmlc-core

  does not contain a CMakeLists.txt file.
```
However, if I clone the GitHub repo instead, I don't get that failure.
Did you get the tarball from the release section? It looks like the tarball doesn't contain the git submodules.
Yes, I got the tarball for release 0.80 and also tried 0.72 from the github release section.
The tarballs from the GitHub release section don't contain the submodules (e.g. dmlc-core). We should probably document this fact. For now, you should use

```
git clone --recursive https://github.com/dmlc/xgboost -b release_0.80
```
Is there a different source for tarballs that include the necessary submodules? Or does building with GPU support require cloning the repo? I would prefer to use a numbered release for stability and reproducibility.
On that note, is there any intention or possibility of building GPU support directly into the xgboost package on CRAN?
> I would prefer to use a numbered release for stability and reproducibility.
We use tags and branches to denote releases, so check out the `v0.80` tag to pin the commit to the time of the 0.80 release. The `release_0.80` branch is similar, but it also contains a few backported changes. To check out the repo at the tag, run

```
git clone --recursive https://github.com/dmlc/xgboost -b v0.80
```
> is there any intention or possibility to build GPU support directly in the xgboost package on CRAN?
Unfortunately, no. CRAN has strict requirements for multi-platform support, so shipping native GPU code in an R package is quite challenging. Currently, CRAN hosts a CPU-only xgboost package.
To build from a tarball, dmlc-core needs to be properly released, and xgboost needs to restrict itself to the dmlc-core release scheme. I patched xgboost at v0.71 to build all dependencies separately (with Python and without CUDA), but that still requires cloning dmlc-core:

https://github.com/trivialfis/guixpkgs

Since XGBoost doesn't seem to have any desire to go into traditional distributions, I didn't make the effort to merge it here.
> dmlc-core needs to be properly released, and xgboost needs to restrict itself to the dmlc-core release scheme
@trivialfis Can you clarify this a bit? Do you mean that each release of XGBoost should be aligned with a particular release of dmlc-core? That would mean that each time XGBoost master gets an updated dmlc-core, there would need to be another dmlc-core release. I think it may be easier to simply upload tarballs manually to the Releases section. All I need to do is check out the git repo with the `--recursive` option, remove the `.git/` directory, and then zip up the rest.
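As a sketch, the packaging steps just described look like this. The clone is commented out and replaced by a stub directory tree so only the archiving part actually runs here; the version number and file names are illustrative:

```shell
set -e
# In practice, start from a recursive clone of the release tag:
# git clone --recursive https://github.com/dmlc/xgboost -b v0.80 xgboost-0.80

# Stub tree standing in for the cloned repo (submodule included):
mkdir -p xgboost-0.80/.git xgboost-0.80/dmlc-core
touch xgboost-0.80/dmlc-core/CMakeLists.txt

# Remove the git metadata, then archive the rest:
rm -rf xgboost-0.80/.git
tar czf xgboost-0.80.tar.gz xgboost-0.80

# The resulting tarball carries the submodule sources but no .git/:
tar tzf xgboost-0.80.tar.gz
```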
@hcho3 You are surely right. It's just that distribution packagers frown upon bundled dependencies. Your idea is perfectly fine for other users.
I uploaded the tarballs for versions 0.71, 0.72, and 0.80. I plan to upload tarballs for future releases as well, since I do so anyway for PyPI.
@trivialfis Got it. Has there been a motion for XGBoost to be distributed in "traditional distributions" (I'm assuming you mean Homebrew, APT etc.)?
@hcho3 Sorry, I don't know what others think; I don't have colleagues who share my interest in this area. But I don't think it's necessary. From what I can tell, many data scientists just import whatever comes to mind in Python to glue together a hundred-line script. If such a requirement exists, I think you would have a better sense of it than me, since I am just toying with everything rather than "deploying".
@trivialfis I concur with your impression. I'm hardly qualified to say anything about traditional distributions, but I think distributing via PyPI meets most users' needs (especially with the recent addition of binary wheels).
@hcho3 Thanks. Both the tagged git repo and the new tarball you uploaded work, and cmake succeeds.
Note that I used

```
cmake .. -DUSE_CUDA=ON -DR_LIB=ON -DCMAKE_C_COMPILER=cuda-gcc -DCMAKE_CXX_COMPILER=cuda-g++
```

to specify `gcc` and `g++` versions (6.4.0) suitable for CUDA v9.1.85, both installed from the negativo repo on Fedora 28 (which defaults to gcc version 8). Editing the `export CC` and `export CXX` lines in xgboost/Makefile and xgboost/make/config.mk didn't work for me; cmake would still fail because it picked up version 8.
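The version constraint behind this can be made explicit with a small helper. This is purely illustrative, assuming (per the report above) that gcc 6.x works as the CUDA 9.1 host compiler while gcc 8 is rejected:

```shell
# Hypothetical pre-flight check before running cmake: warn if the host gcc
# major version is newer than what this CUDA release accepts.
check_host_gcc() {
  # $1 = gcc major version to check
  if [ "$1" -le 6 ]; then
    echo "gcc $1: OK as host compiler for CUDA 9.1"
  else
    echo "gcc $1: too new for CUDA 9.1; pass an older one via -DCMAKE_C_COMPILER/-DCMAKE_CXX_COMPILER"
  fi
}

check_host_gcc 6
check_host_gcc 8
```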
Now `make install -j` succeeds in building xgboost but fails while installing the project:

```
[...]
[100%] Built target xgboost
Install the project...
-- Install configuration: "Release"
-- Installing: [PATH]/xgboost/build/R-package/src/xgboost.so
-- Set runtime path of "[PATH]/xgboost/build/R-package/src/xgboost.so" to ""
> deps = setdiff(c('data.table', 'magrittr', 'stringi'), rownames(installed.packages())); if(length(deps)>0) install.packages(deps, repo = 'https://cloud.r-project.org/')
>
>
Warning: invalid package ‘/[PATH]/xgboost/build/R-package’
Error: ERROR: no packages specified
```
Am I missing something?
@adatum Can you try running `R CMD INSTALL .` inside `/[PATH]/xgboost/build/R-package`?
```
$ R CMD INSTALL .
Warning: invalid package ‘.’
Error: ERROR: no packages specified
```
What should the R-package directory contain?
```
$ ls -R
.:
src

./src:
Makevars  Makevars.win  xgboost.so
```
It's weird that `build/R-package` is somehow empty. For now, copy `libxgboost.so` to `/[PATH]/xgboost/R-package/src` and run `R CMD INSTALL .` inside `/[PATH]/xgboost/R-package`.
Did you mean to copy `xgboost.so` from `xgboost/build/R-package/src` to `xgboost/R-package/src`?
This worked to get the R package installed from `xgboost/R-package` with `R CMD INSTALL .`, as you said.
However, when I tried the `gpu_accelerated` demo, I got an error:

```
Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) :
  [01:33:22] ../..//amalgamation/../src/learner.cc:152: XGBoost version not compiled with GPU support.
```
Are there files in the `xgboost/build` directories that contain resources compiled with GPU support that also need to be copied over?
In your `xgboost` directory, can you find all `*.so` files? Also, can you look at the compilation log to see if `nvcc` was invoked?
```
$ find . -name '*.so'
./R-package/src/xgboost.so
./build/R-package/src/xgboost.so
./build/xgboost.so
```
Which log file would you like me to search? I searched CMakeOutput.log and found no mention of `nvcc`. cmake previously could not find GTest, but after installing `gtest` and `gtest-devel` that warning went away, and cmake no longer generates a CMakeError.log file.
Run `make clean` and then `make -j VERBOSE=1`. This will show all compilation flags used.
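Once the verbose log is captured (say with `make -j VERBOSE=1 2>&1 | tee build.log`), a quick grep answers the `nvcc` question. The log below is a two-line stub with made-up paths standing in for the real output:

```shell
# Stub build log; in practice this file comes from the verbose make run.
cat > build.log <<'EOF'
/usr/bin/nvcc -DNVCC -c updater_gpu_hist.cu -o updater_gpu_hist.o
/usr/bin/c++ -DXGBOOST_USE_CUDA -c learner.cc -o learner.o
EOF

# Count nvcc invocations; a non-zero count means CUDA sources were compiled.
grep -c nvcc build.log
```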
With `VERBOSE=1`, it shows that `/usr/bin/nvcc` was called several times, and the flag `-DNVCC` is used.
Are the contents of `xgboost/build/install_manifest.txt` as expected?

```
[PATH]/xgboost/build/R-package/src/xgboost.so
[PATH]/xgboost/build/dummy_inst/lib/libdmlc.a
[PATH]/xgboost/build/dummy_inst/./include/dmlc/common.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/thread_local.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/optional.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/config.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/serializer.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/any.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/json.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/thread_group.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/timer.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/endian.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/io.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/array_view.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/parameter.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/registry.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/data.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/concurrency.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/blockingconcurrentqueue.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/recordio.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/base.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/logging.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/type_traits.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/concurrentqueue.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/memory.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/omp.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/lua.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/input_split_shuffle.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/memory_io.h
[PATH]/xgboost/build/dummy_inst/./include/dmlc/threadediter.h
[PATH]/xgboost/build/dummy_inst/./doc/README
[PATH]/xgboost/build/dummy_inst/./doc/Doxyfile
[PATH]/xgboost/build/dummy_inst/./doc/Makefile
[PATH]/xgboost/build/dummy_inst/./doc/sphinx_util.py
[PATH]/xgboost/build/dummy_inst/./doc/.gitignore
[PATH]/xgboost/build/dummy_inst/./doc/parameter.md
[PATH]/xgboost/build/dummy_inst/./doc/conf.py
[PATH]/xgboost/build/dummy_inst/./doc/index.md
```
Can you post the full log of `make -j VERBOSE=1`?
Hope it helps.

`make -j VERBOSE=1` output: https://gist.github.com/adatum/a9bcbb928710f5ca4fd227b395b6a938

CMakeOutput.log: https://gist.github.com/adatum/e3c9ec2c3c03bb2e12dbc103710ef624
Looks like `xgboost.so` should have all GPU objects linked in. (Notice that the macro `XGBOOST_USE_CUDA` is being used; this is what the GPU-support check looks for.) Can you run `nm -gC xgboost.so` and post the output? This should list all compiled symbols.
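One quick way to scan that listing for GPU code is to grep it. The `nm` output below is a two-line stub with made-up symbol names standing in for the real thing; in practice pipe `nm -gC xgboost.so` straight into grep:

```shell
# Stub symbol listing; real output comes from `nm -gC xgboost.so`.
cat > nm_output.txt <<'EOF'
0000000000a1b2c3 T XGBoosterCreate
0000000000d4e5f6 T xgboost::tree::GPUHistMaker::Update()
EOF

# Any GPU/CUDA-flavored symbols indicate the GPU objects were linked in.
grep -Ei 'gpu|cuda' nm_output.txt
```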
Output of `nm -gC xgboost.so`: https://gist.github.com/adatum/43a50b939a5cdfbab9adf63070d01bc8
So `xgboost.so` indeed contains all the GPU code. Let's double-check the XGBoost package installation. Can you locate where the XGBoost R package has been installed? Check whether the `xgboost.so` file there is the same as the one found at `[PATH]/xgboost/build/R-package/src/xgboost.so`.
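A byte-for-byte comparison with `cmp` settles this. The sketch below uses two stub files; in practice, compare the installed package's copy (e.g. under the library path R reports for the package) against `build/R-package/src/xgboost.so`:

```shell
# Stub files standing in for the built and installed shared libraries.
printf 'stub-gpu-build' > built_xgboost.so
cp built_xgboost.so installed_xgboost.so

# cmp -s exits 0 only if the files are byte-identical.
if cmp -s built_xgboost.so installed_xgboost.so; then
  echo "identical"
else
  echo "differ: re-copy the freshly built xgboost.so"
fi
```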
Thank you! Inexplicably, the `xgboost.so` in the R package installation was a different (smaller) size than the one in `xgboost/build/src/xgboost.so` with GPU support. With the larger (38.7 MB) file copied over, the `gpu_accelerated` demo works.

Perhaps I copied the file incorrectly, though I'm puzzled as to how that could happen. In any case, most of the trouble here seems to come from the R package build/installation process. Thanks again so much for patiently debugging the issue.
@adatum Yeah, I don't know why the R installation process gave you so much headache. Glad to hear you finally got it working. In the future, I hope to find a way to make installation less painful.