Lightgbm: [R-package] Error installing R package with local pre-compile option in `build_r.R`

Created on 6 Sep 2020 · 7 Comments · Source: microsoft/LightGBM

Hi,

I'm having a problem installing lightgbm for R after compiling it from source on the git master branch.

My locally compiled version works with my Python install, but I'm struggling with R. It does not seem to be able to find lib_lightgbm.so, which I clearly have at ~/git/LightGBM/lib_lightgbm.so.

I followed the instructions here. Specifically, I followed this instruction:

"If you are using a precompiled dll/lib locally, you can move the dll/lib into LightGBM root folder, modify LightGBM/R-package/src/install.libs.R's 2nd line (change use_precompile <- FALSE to use_precompile <- TRUE), and install R-package as usual."

What exact files do I need to copy for Linux? ~/git/LightGBM/lib_lightgbm.so? And to where?
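For reference, the precompiled branch of the `install.libs.R` posted later in this thread checks three locations, in order, relative to the parent of the package source directory. The sketch below mirrors that lookup order. `find_precompiled_lib()` is a hypothetical helper written for illustration and is not part of LightGBM. Unlike the real script, it returns `NA` instead of falling through to the Windows path. Note also that the real script resolves the folder from `R_PACKAGE_SOURCE`, which may not be the repository checkout when R installs from a tarball; that is an untested guess at why the file was not found.

```r
# Hypothetical helper mirroring the lookup order used by the precompiled
# branch of install.libs.R. Not part of LightGBM itself.
find_precompiled_lib <- function(lib_folder, shlib_ext = ".so") {
  lib_name <- paste0("lib_lightgbm", shlib_ext)
  candidates <- c(
    file.path(lib_folder, lib_name)
    , file.path(lib_folder, "Release", lib_name)
    , file.path(lib_folder, "windows", "x64", "DLL", lib_name)
  )
  # keep only the candidates that exist, preserving the search order
  found <- candidates[file.exists(candidates)]
  if (length(found) == 0L) {
    return(NA_character_)
  }
  return(found[[1L]])
}
```

Calling `find_precompiled_lib(path.expand("~/git/LightGBM"))` would show which copy, if any, would be picked up when the lookup starts from the repository root.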

I attached the console output after I ran Rscript build_r.R below.

THANKS!

How you are using LightGBM?

  • R package
  • Python package

Environment info

Operating System: Ubuntu 20.04

CPU/GPU model: Nvidia GeForce 1080Ti

C++ compiler version: GCC 9.3

CMake version: 3.16.3

R version: 4.0.2

Other:

LightGBM version or commit hash: git master branch

Error message and / or logs

>Rscript build_r.R

 checking for file ‘/home/abc/git/LightGBM/lightgbm_r/DESCRIPTION’ ... OK
* preparing ‘lightgbm’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
WARNING: directory ‘lightgbm/src/compute/test’ is empty
* looking to see if a ‘data/datalist’ file should be added
* building ‘lightgbm_3.0.0.99.tar.gz’

* installing to library ‘/home/abc/R/x86_64-pc-linux-gnu-library/4.0’
* installing *source* package ‘lightgbm’ ...
** using staged installation
** libs
installing via 'install.libs.R' to /home/abc/R/x86_64-pc-linux-gnu-library/4.0/00LOCK-lightgbm/00new/lightgbm

Error in eval(ei, envir) : Cannot find lib_lightgbm.so

* removing ‘/home/abc/R/x86_64-pc-linux-gnu-library/4.0/lightgbm’
Error in .run_shell_command(install_cmd, install_args) : 
  Command failed with exit code: 1
Execution halted

All 7 comments

hello @esvhd , thanks for using LightGBM! We have been making a lot of changes to the R package recently and I think that installing the R package together with a precompiled lib_lightgbm.so is in a weird state.

Is there a strong reason that you need to use a precompiled binary? For example, did you build it with some customization?

If not, could you please try building from source? You can just run Rscript build_r.R.

If that doesn't work, you might have more luck with the CRAN package.

sh build-cran-package.sh
R CMD INSTALL lightgbm_3.0.0.tar.gz

Thanks for the prompt reply!

I built the package from source to enable MPI and GPU support. The R install guide explains how to turn on GPU support, but I found nothing on MPI. Is there anything I need to do there?

Thanks!


@esvhd ah, maybe that is a gap in our documentation, apologies!

Can you replace R-package/src/install.libs.R with this?

new install.libs.R:

# User options
use_precompile <- FALSE
use_gpu <- TRUE

# For Windows, the package will be built with Visual Studio
# unless you set one of these to TRUE
use_mingw <- FALSE
use_msys2 <- FALSE

if (use_mingw && use_msys2) {
  stop("Cannot use both MinGW and MSYS2. Please choose only one.")
}

if (.Machine$sizeof.pointer != 8L) {
  stop("LightGBM only supports 64-bit R, please check the version of R and Rtools.")
}

R_int_UUID <- .Internal(internalsID())
R_ver <- as.double(R.Version()$major) + as.double(R.Version()$minor) / 10.0

if (!(R_int_UUID == "0310d4b8-ccb1-4bb8-ba94-d36a55f60262"
    || R_int_UUID == "2fdf6c18-697a-4ba7-b8ef-11c0d92f1327")) {
  warning("Warning: unmatched R_INTERNALS_UUID, may not run normally.")
}

# system() will not raise an R exception if the process called
# fails. Wrapping it here to get that behavior.
#
# system() introduces a lot of overhead, at least on Windows,
# so trying processx if it is available
.run_shell_command <- function(cmd, args, strict = TRUE) {
    on_windows <- .Platform$OS.type == "windows"
    has_processx <- suppressMessages({
      suppressWarnings({
        require("processx")  # nolint
      })
    })
    if (has_processx && on_windows) {
      result <- processx::run(
        command = cmd
        , args = args
        , windows_verbatim_args = TRUE
        , error_on_status = FALSE
        , echo = TRUE
      )
      exit_code <- result$status
    } else {
      if (on_windows) {
        message(paste0(
          "Using system() to run shell commands. Installing "
          , "'processx' with install.packages('processx') might "
          , "make this faster."
        ))
      }
      cmd <- paste0(cmd, " ", paste0(args, collapse = " "))
      exit_code <- system(cmd)
    }

    if (exit_code != 0L && isTRUE(strict)) {
        stop(paste0("Command failed with exit code: ", exit_code))
    }
    return(invisible(exit_code))
}

# try to generate Visual Studio build files
.generate_vs_makefiles <- function(cmake_args) {
  vs_versions <- c(
    "Visual Studio 16 2019"
    , "Visual Studio 15 2017"
    , "Visual Studio 14 2015"
  )
  working_vs_version <- NULL
  for (vs_version in vs_versions) {
    message(sprintf("Trying '%s'", vs_version))
    # if the build directory is not empty, clean it
    if (file.exists("CMakeCache.txt")) {
      file.remove("CMakeCache.txt")
    }
    vs_cmake_args <- c(
      cmake_args
      , "-G"
      , shQuote(vs_version)
      , "-A"
      , "x64"
    )
    exit_code <- .run_shell_command("cmake", c(vs_cmake_args, ".."), strict = FALSE)
    if (exit_code == 0L) {
      message(sprintf("Successfully created build files for '%s'", vs_version))
      return(invisible(TRUE))
    }

  }
  return(invisible(FALSE))
}

# Move in CMakeLists.txt
write_succeeded <- file.copy(
  "../inst/bin/CMakeLists.txt"
  , "CMakeLists.txt"
  , overwrite = TRUE
)
if (!write_succeeded) {
  stop("Copying CMakeLists.txt failed")
}

# Get some paths
source_dir <- file.path(R_PACKAGE_SOURCE, "src", fsep = "/")
build_dir <- file.path(source_dir, "build", fsep = "/")

# Check for precompilation
if (!use_precompile) {

  # Prepare building package
  dir.create(
    build_dir
    , recursive = TRUE
    , showWarnings = FALSE
  )
  setwd(build_dir)

  use_visual_studio <- !(use_mingw || use_msys2)

  # If using MSVC to build, pull in the script used
  # to create R.def from R.dll
  if (WINDOWS && use_visual_studio) {
    write_succeeded <- file.copy(
      "../../inst/make-r-def.R"
      , file.path(build_dir, "make-r-def.R")
      , overwrite = TRUE
    )
    if (!write_succeeded) {
      stop("Copying make-r-def.R failed")
    }
  }

  # Prepare installation steps
  cmake_args <- "-DUSE_MPI=ON"
  build_cmd <- "make"
  build_args <- "_lightgbm"
  lib_folder <- file.path(source_dir, fsep = "/")

  WINDOWS_BUILD_TOOLS <- list(
    "MinGW" = c(
      build_tool = "mingw32-make.exe"
      , makefile_generator = "MinGW Makefiles"
    )
    , "MSYS2" = c(
      build_tool = "make.exe"
      , makefile_generator = "MSYS Makefiles"
    )
  )

  if (use_mingw) {
    windows_toolchain <- "MinGW"
  } else if (use_msys2) {
    windows_toolchain <- "MSYS2"
  } else {
    # Rtools 4.0 moved from MinGW to MSYS toolchain. If user tries
    # Visual Studio install but that fails, fall back to the toolchain
    # supported in Rtools
    if (R_ver >= 4.0) {
      windows_toolchain <- "MSYS2"
    } else {
      windows_toolchain <- "MinGW"
    }
  }
  windows_build_tool <- WINDOWS_BUILD_TOOLS[[windows_toolchain]][["build_tool"]]
  windows_makefile_generator <- WINDOWS_BUILD_TOOLS[[windows_toolchain]][["makefile_generator"]]

  if (use_gpu) {
    cmake_args <- c(cmake_args, "-DUSE_GPU=ON")
  }
  cmake_args <- c(cmake_args, "-DBUILD_FOR_R=ON")

  # Pass in R version, used to help find R executable for linking
  R_version_string <- paste(
    R.Version()[["major"]]
    , R.Version()[["minor"]]
    , sep = "."
  )
  r_version_arg <- sprintf("-DCMAKE_R_VERSION='%s'", R_version_string)
  cmake_args <- c(cmake_args, r_version_arg)

  # the checks below might already run `cmake -G`. If they do, set this flag
  # to TRUE to avoid re-running it later
  makefiles_already_generated <- FALSE

  # Check if Windows installation (for gcc vs Visual Studio)
  if (WINDOWS) {
    if (!use_visual_studio) {
      message(sprintf("Trying to build with %s", windows_toolchain))
      # Must build twice for Windows due sh.exe in Rtools
      cmake_args <- c(cmake_args, "-G", shQuote(windows_makefile_generator))
      .run_shell_command("cmake", c(cmake_args, ".."), strict = FALSE)
      build_cmd <- windows_build_tool
      build_args <- "_lightgbm"
    } else {
      visual_studio_succeeded <- .generate_vs_makefiles(cmake_args)
      if (!isTRUE(visual_studio_succeeded)) {
        warning(sprintf("Building with Visual Studio failed. Attempting with %s", windows_toolchain))
        # Must build twice for Windows due sh.exe in Rtools
        cmake_args <- c(cmake_args, "-G", shQuote(windows_makefile_generator))
        .run_shell_command("cmake", c(cmake_args, ".."), strict = FALSE)
        build_cmd <- windows_build_tool
        build_args <- "_lightgbm"
      } else {
        build_cmd <- "cmake"
        build_args <- c("--build", ".", "--target", "_lightgbm", "--config", "Release")
        lib_folder <- file.path(source_dir, "Release", fsep = "/")
        makefiles_already_generated <- TRUE
      }
    }
  } else {
      .run_shell_command("cmake", c(cmake_args, ".."))
      makefiles_already_generated <- TRUE
  }

  # generate build files
  if (!makefiles_already_generated) {
    .run_shell_command("cmake", c(cmake_args, ".."))
  }

  # R CMD check complains about the .NOTPARALLEL directive created in the cmake
  # Makefile. We don't need it here anyway since targets are built serially, so trying
  # to remove it with this hack
  generated_makefile <- file.path(
    build_dir
    , "Makefile"
  )
  if (file.exists(generated_makefile)) {
    makefile_txt <- readLines(
      con = generated_makefile
    )
    makefile_txt <- gsub(
      pattern = ".*NOTPARALLEL.*"
      , replacement = ""
      , x = makefile_txt
    )
    writeLines(
      text = makefile_txt
      , con = generated_makefile
      , sep = "\n"
    )
  }

  # build the library
  message("Building lib_lightgbm")
  .run_shell_command(build_cmd, build_args)
  src <- file.path(lib_folder, paste0("lib_lightgbm", SHLIB_EXT), fsep = "/")

} else {

  # Has precompiled package
  lib_folder <- file.path(R_PACKAGE_SOURCE, "../", fsep = "/")
  shared_object_file <- file.path(
    lib_folder
    , paste0("lib_lightgbm", SHLIB_EXT)
    , fsep = "/"
  )
  release_file <- file.path(
    lib_folder
    , paste0("Release/lib_lightgbm", SHLIB_EXT)
    , fsep = "/"
  )
  windows_shared_object_file <- file.path(
    lib_folder
    , paste0("/windows/x64/DLL/lib_lightgbm", SHLIB_EXT)
    , fsep = "/"
  )
  if (file.exists(shared_object_file)) {
    src <- shared_object_file
  } else if (file.exists(release_file)) {
    src <- release_file
  } else {
    # Expected result: installation will fail if it is not here or any other
    src <- windows_shared_object_file
  }
}

# Packages with install.libs.R need to copy some artifacts into the
# expected places in the package structure.
# see https://cran.r-project.org/doc/manuals/r-devel/R-exts.html#Package-subdirectories,
# especially the paragraph on install.libs.R
dest <- file.path(R_PACKAGE_DIR, paste0("libs", R_ARCH), fsep = "/")
dir.create(dest, recursive = TRUE, showWarnings = FALSE)
if (file.exists(src)) {
  message(paste0("Found library file: ", src, " to move to ", dest))
  file.copy(src, dest, overwrite = TRUE)

  symbols_file <- file.path(source_dir, "symbols.rds")
  if (file.exists(symbols_file)) {
    file.copy(symbols_file, dest, overwrite = TRUE)
  }

} else {
  stop(paste0("Cannot find lib_lightgbm", SHLIB_EXT))
}

# clean up the "build" directory
if (dir.exists(build_dir)) {
  message("Removing 'build/' directory")
  unlink(
    x = build_dir
    , recursive = TRUE
    , force = TRUE
  )
}

Then if you use Rscript build_r.R, I believe it will build a GPU-enabled version of the package with MPI support. This isn't a combination we've tested with the R package before, so please let me know what happens.

Could you do me a favor and open an issue called "[R-package] allow use of MPI for distributed training" and explain there why you have a strong preference for using MPI vs. socket-based distributed training? Then I can work on making this easier.

As part of #2441 , I'm going to simplify the CMake-based installation of the R package so you'll be able to do something like Rscript build_r.R --use-gpu --use-mpi.

Hi @jameslamb Impressed by the helpful reply!

I tried this, and it looks like CMake is complaining that it cannot find OpenCL.

I also tried adding these flags to my ~/.R/Makevars, but that did not work either.

OpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so
OpenCL_INCLUDE_DIR=/usr/local/cuda/include/

When I compiled the Python package, I had to add the following flags for cmake:

cmake -DUSE_MPI=ON -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/ ..

Full output from build_r.R:

(base) abc:~/git/LightGBM$ Rscript build_r.R 
* checking for file ‘/home/abc/git/LightGBM/lightgbm_r/DESCRIPTION’ ... OK
* preparing ‘lightgbm’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
WARNING: directory ‘lightgbm/src/compute/test’ is empty
* looking to see if a ‘data/datalist’ file should be added
* building ‘lightgbm_3.0.0.99.tar.gz’

[1] "current tarball dir: "
[1] "/home/abc/git/LightGBM"
* installing to library ‘/home/abc/R/x86_64-pc-linux-gnu-library/4.0’
* installing *source* package ‘lightgbm’ ...
** using staged installation
** libs
installing via 'install.libs.R' to /home/abc/R/x86_64-pc-linux-gnu-library/4.0/00LOCK-lightgbm/00new/lightgbm
-- The C compiler identification is GNU 9.3.0
-- The CXX compiler identification is GNU 9.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- R version passed into FindLibR.cmake: 4.0.2
-- Found LibR: /usr/lib/R  
-- LIBR_EXECUTABLE: /usr/bin/R
-- LIBR_INCLUDE_DIRS: /usr/share/R/include
-- LIBR_CORE_LIBRARY: /usr/lib/R/lib/libR.so
-- Found MPI_C: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so (found version "3.1") 
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  
-- MPI libraries: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- MPI C++ libraries: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Looking for CL_VERSION_2_2
-- Looking for CL_VERSION_2_2 - not found
-- Looking for CL_VERSION_2_1
-- Looking for CL_VERSION_2_1 - not found
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - not found
-- Looking for CL_VERSION_1_2
-- Looking for CL_VERSION_1_2 - not found
-- Looking for CL_VERSION_1_1
-- Looking for CL_VERSION_1_1 - not found
-- Looking for CL_VERSION_1_0
-- Looking for CL_VERSION_1_0 - not found
CMake Error at /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
  Could NOT find OpenCL (missing: OpenCL_INCLUDE_DIR)
Call Stack (most recent call first):
  /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
  /usr/share/cmake-3.16/Modules/FindOpenCL.cmake:150 (find_package_handle_standard_args)
  CMakeLists.txt:111 (find_package)


-- Configuring incomplete, errors occurred!
See also "/tmp/RtmpA2qnFd/R.INSTALL14b751f4fcf/lightgbm/src/build/CMakeFiles/CMakeOutput.log".
See also "/tmp/RtmpA2qnFd/R.INSTALL14b751f4fcf/lightgbm/src/build/CMakeFiles/CMakeError.log".
Error in .run_shell_command("cmake", c(cmake_args, "..")) : 
  Command failed with exit code: 1
* removing ‘/home/abc/R/x86_64-pc-linux-gnu-library/4.0/lightgbm’
Error in .run_shell_command(install_cmd, install_args) : 
  Command failed with exit code: 1
Execution halted
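The "missing: OpenCL_INCLUDE_DIR" error means CMake's FindOpenCL module could not locate the OpenCL header (it searches for `CL/cl.h`, or `OpenCL/cl.h` on macOS). A quick check from R that the paths used for the Python build really contain what CMake needs might look like the sketch below. `check_opencl_paths()` is a hypothetical helper, and the paths in the usage note are simply the ones quoted in this comment.

```r
# Hypothetical helper: checks that an OpenCL library file and the CL/cl.h
# header exist at the given locations before passing them to cmake.
check_opencl_paths <- function(opencl_library, opencl_include_dir) {
  header <- file.path(opencl_include_dir, "CL", "cl.h")
  checks <- c(
    library = file.exists(opencl_library)
    , header = file.exists(header)
  )
  for (name in names(checks)) {
    status <- if (checks[[name]]) "found" else "MISSING"
    message(sprintf("OpenCL %s: %s", name, status))
  }
  return(checks)
}
```

For the setup described here, the call would be `check_opencl_paths("/usr/local/cuda/lib64/libOpenCL.so", "/usr/local/cuda/include/")`. If both entries come back `TRUE`, the remaining problem is most likely that the paths are not reaching the `cmake` call.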

I also have to admit that enabling MPI at this stage is mostly for my own learning, but I'm happy to open an issue anyway to kick this off. I'm sure there are more capable minds out there who can chip in.

Appreciate your help.

hmmm ok, interesting! Sorry, I know very little about OpenCL (my main role here is maintaining the R package) so I can only help so far.

My answer got long as I started typing, so I broke it up.

Things to try

  1. Symlink or copy those OpenCL libraries into a common place on the search path of CMake's find_package().
  2. Add the customizations from your cmake command to cmake_args in R-package/src/install.libs.R. For example:
# keep any flags you already pass (for example "-DUSE_MPI=ON") in this vector too
cmake_args <- c(
    "-DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so"
    , "-DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/"
)
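To see what this ends up running, here is a minimal sketch of how the flags are combined on Linux. In the `install.libs.R` posted above, `cmake_args` is extended with `-DUSE_GPU=ON` and `-DBUILD_FOR_R=ON` and then joined with spaces into a single `cmake` command. `build_cmake_command()` is a hypothetical helper that imitates that joining step; the `-DCMAKE_R_VERSION` flag the real script also adds is left out for brevity, and the OpenCL paths are the ones from this thread.

```r
# Hypothetical helper imitating how the script joins cmake arguments
# into one shell command on non-Windows systems.
build_cmake_command <- function(cmake_args) {
  paste0("cmake", " ", paste0(c(cmake_args, ".."), collapse = " "))
}

# Paths taken from this thread; adjust them for your own machine.
opencl_library <- "/usr/local/cuda/lib64/libOpenCL.so"
opencl_include_dir <- "/usr/local/cuda/include/"

cmake_args <- c(
  "-DUSE_MPI=ON"
  , paste0("-DOpenCL_LIBRARY=", opencl_library)
  , paste0("-DOpenCL_INCLUDE_DIR=", opencl_include_dir)
)
# flags the script appends later when use_gpu is TRUE
cmake_args <- c(cmake_args, "-DUSE_GPU=ON", "-DBUILD_FOR_R=ON")

cmd <- build_cmake_command(cmake_args)
message(cmd)
```

Printing the command this way makes it easy to compare against the `cmake` invocation that already worked for the Python build.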

Background on why things work this way in {lightgbm}

I know for sure that the cmake command you mentioned is missing an important flag for the R package: -DBUILD_FOR_R=ON

https://github.com/microsoft/LightGBM/blob/afc76d2cb8234f6876ed75d923a7916bfef9a1e5/CMakeLists.txt#L17

There are parts of the library that are handled differently for the R package. For example, we use R's built-in logger instead of the *printf family of C functions.

I don't believe ~/.R/Makevars will have an effect on how the R package is built, since we use install.libs.R. There are a lot more details in https://cran.r-project.org/doc/manuals/r-release/R-exts.html, but basically R packages calling C++ code can do one of two options:

  1. template a Makefile and fill it in at build time with a configure script
  2. run R code in a src/install.libs.R

When you use Rscript build_r.R (our hook into CMake from R), you are using option 2. I don't believe ~/.R/Makevars will impact the choices that that path makes about which include dirs to use.
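As a rough illustration of option 2: when a package ships `src/install.libs.R`, R runs that script during installation with variables such as `R_PACKAGE_DIR`, `R_ARCH` and `SHLIB_EXT` already defined, and the script is responsible for copying the built library into place. The sketch below wraps that copy step in a function so it can be run outside of `R CMD INSTALL`. `install_libs()` and its arguments are hypothetical stand-ins for those variables, not LightGBM code.

```r
# Hypothetical stand-in for the copy step of a src/install.libs.R script.
# src_dir, pkg_dir, r_arch and shlib_ext play the roles of the working
# directory, R_PACKAGE_DIR, R_ARCH and SHLIB_EXT that R normally provides.
install_libs <- function(src_dir, pkg_dir, r_arch = "", shlib_ext = ".so") {
  # find every shared library produced by the build
  files <- Sys.glob(file.path(src_dir, paste0("*", shlib_ext)))
  # libs/ (plus an optional architecture suffix) is where R expects them
  dest <- file.path(pkg_dir, paste0("libs", r_arch))
  dir.create(dest, recursive = TRUE, showWarnings = FALSE)
  copied <- file.copy(files, dest, overwrite = TRUE)
  return(invisible(file.path(dest, basename(files))[copied]))
}
```

Nothing in this path reads `~/.R/Makevars`; whatever compiler and include settings are needed have to be passed by the script itself, which is why the flags go into `cmake_args`.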

Hope that helps! Let me know what happens and we can debug further.

Amazing, glad to report option 2 you suggested seems to have worked!

I replaced line 137 of the install.libs.R you posted (the cmake_args <- "-DUSE_MPI=ON" line) with the line below. After that, the build script worked fine and the lightgbm package is now installed for my R 4.0.2.

 cmake_args <- c("-DUSE_MPI=ON", " -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so", " -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/")

Thanks @jameslamb !!

Great! And thanks for opening #3364.

I'll close this issue since things are working for you now. Come back any time!
