XGBoost: [RFC] 1.2.0 Release

Created on 2 Aug 2020 · 24 comments · Source: dmlc/xgboost

Roadmap: #5734

We are about to release version 1.2.0 of XGBoost. Over the next two weeks, we invite everyone to try out the release candidate (RC).

Feedback period: until the end of August 21, 2020. No new features will be added to this release; only critical bug fixes will be accepted.

@dmlc/xgboost-committer

Now available

  • Python package. RC2 available on PyPI. Try it out with the command
python3 -m pip install xgboost==1.2.0rc2
  • R package. RC2 available from the Releases section. Download the tarball file xgboost_1.2.0.1.tar.gz and run
R CMD INSTALL xgboost_1.2.0.1.tar.gz

Rendered R manual

  • JVM packages. RC2 available from our Maven repository. Add XGBoost4J as a dependency to your Java application.

Instructions (Maven/SBT)

Maven

```xml
<dependencies>
  ...
  <dependency>
      <groupId>ml.dmlc</groupId>
      <artifactId>xgboost4j_2.12</artifactId>
      <version>1.2.0-RC2</version>
  </dependency>
  <dependency>
      <groupId>ml.dmlc</groupId>
      <artifactId>xgboost4j-spark_2.12</artifactId>
      <version>1.2.0-RC2</version>
  </dependency>
</dependencies>

<repositories>
  <repository>
    <id>XGBoost4J Release Repo</id>
    <name>XGBoost4J Release Repo</name>
    <url>https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/release/</url>
  </repository>
</repositories>
```

SBT

```scala
libraryDependencies ++= Seq(
  "ml.dmlc" %% "xgboost4j" % "1.2.0-RC2",
  "ml.dmlc" %% "xgboost4j-spark" % "1.2.0-RC2"
)
resolvers += ("XGBoost4J Release Repo"
              at "https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/release/")
```

Starting from 1.2.0, XGBoost4J-Spark supports training with NVIDIA GPUs. To enable this capability, download artifacts suffixed with -gpu, as follows:

Instructions (Maven/SBT)

Maven

```xml
<dependencies>
  ...
  <dependency>
      <groupId>ml.dmlc</groupId>
      <artifactId>xgboost4j-gpu_2.12</artifactId>
      <version>1.2.0-RC2</version>
  </dependency>
  <dependency>
      <groupId>ml.dmlc</groupId>
      <artifactId>xgboost4j-spark-gpu_2.12</artifactId>
      <version>1.2.0-RC2</version>
  </dependency>
</dependencies>

<repositories>
  <repository>
    <id>XGBoost4J Release Repo</id>
    <name>XGBoost4J Release Repo</name>
    <url>https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/release/</url>
  </repository>
</repositories>
```

SBT

```scala
libraryDependencies ++= Seq(
  "ml.dmlc" %% "xgboost4j-gpu" % "1.2.0-RC2",
  "ml.dmlc" %% "xgboost4j-spark-gpu" % "1.2.0-RC2"
)
resolvers += ("XGBoost4J Release Repo"
              at "https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/release/")
```

Deprecation notices

  • Starting from this release, XGBoost requires Python 3.6 or later.
  • CUDA 10.0 or later is now required. (On Windows, CUDA 10.1 is required.)
  • XGBoost4J and XGBoost4J-Spark have fully transitioned to Spark 3.0.0. Therefore, Scala 2.12 is now required, and Scala 2.11 is no longer supported.

TODOs

  • [x] Create a new branch release_1.2.0.
  • [x] Create Python wheels and upload to PyPI.
  • [x] Upload RC1 to our Maven repo.
  • [x] Create a tarball for the R package and upload it to the Releases section.
  • [x] Write release notes.

PRs that have been backported to the release branch.

All 24 comments

Could you please add some docs on how to perform each of the listed steps as an XGBoost maintainer? (covering any XGBoost-specific details, like the PyPI account, approvals, etc.)

@trivialfis Here are the steps for the Python package:

  1. Create a new release branch.
  2. Push a commit to the branch to update the version number to RC1.
  3. Wait until the CI system builds artifacts. Currently, we use Jenkins and Travis CI to build Windows, Mac, and Linux binaries. The binaries are uploaded to https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/list.html.
  4. Now upload the binary wheels using the twine module: python -m twine upload *.whl. This will require PyPI credentials.

Got it. Thanks.

I spent a fair amount of effort automating the release binary builds. What's not automated is uploading them to the distribution channels (PyPI, CRAN, Maven Central, etc.).

TODO: Investigate whether the GPU algorithm can be enabled in the JAR artifact. If the JAR file contains CUDA code, will it work on a cluster without a GPU? I will need to test and find out.

That depends on how the library loading is done. If the GPU binary tries to dynamically load the CUDA libraries, you'll get an UnsatisfiedLinkError out of the native loader. You could probe for the presence of the CUDA libraries and conditionally load the GPU binary, falling back to the CPU one otherwise, but that's tricky: the loader could catch the UnsatisfiedLinkError, log it, and report that it is falling back to CPU, yet an UnsatisfiedLinkError can also arise for unrelated reasons (e.g. a missing OpenMP runtime), which would be confusing. Bundling both binaries also blows up the JAR size, as you'd need two copies of everything.

In most Java projects the CPU and GPU artifacts are separate (e.g. https://search.maven.org/search?q=g:com.microsoft.onnxruntime), but that can cause issues in downstream builds, since at some point a developer has to choose between the CPU and the GPU binary. Fortunately, in production you can simply place the GPU artifact higher up the classpath and it will load just fine.
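To make the probe-and-fallback idea concrete, here is a minimal Java sketch. The class and the native library names are hypothetical stand-ins for illustration, not the actual XGBoost4J loader:

```java
// Sketch of probe-and-fallback native loading, as described above.
// GpuAwareLoader and the library names "xgboost4j_gpu"/"xgboost4j" are
// hypothetical illustrations, not the real XGBoost4J loader API.
public final class GpuAwareLoader {

    /** Tries the GPU-enabled native binary first; falls back to the CPU binary. */
    public static void load() {
        try {
            System.loadLibrary("xgboost4j_gpu");
            System.out.println("Loaded GPU-enabled native binary.");
        } catch (UnsatisfiedLinkError gpuError) {
            // Caveat from above: this error may mean the CUDA libraries are
            // missing, but it can also stem from unrelated problems (e.g. a
            // missing OpenMP runtime), so log the cause instead of hiding it.
            System.err.println("GPU binary failed to load ("
                    + gpuError.getMessage() + "); falling back to CPU binary.");
            System.loadLibrary("xgboost4j");
        }
    }

    public static void main(String[] args) {
        load();
    }
}
```

As noted, shipping both binaries in one JAR means carrying two copies of everything, which is why separate CPU and GPU artifacts are the more common arrangement.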

There's a CUDA stub library linked into XGBoost, so a link-time error should not happen. But I agree that we should have better tests.

@Craigacp The GPU algorithm uses NCCL to perform allreduce, and including the NCCL library in the JAR file increases its size to 150 MB. Does Maven Central accept artifacts this large? cc @CodingCat @sperlingxx

It does. For example, the libtensorflow JNI GPU artifact for 1.15 is 355 MB.

@Craigacp @wbo4958 Here are the JAR files I built with the GPU algorithm enabled:

To install:

```
mvn install:install-file -Dfile=./xgboost4j_2.12-1.2.0-RC1.jar -DgroupId=ml.dmlc \
    -DartifactId=xgboost4j_2.12 -Dversion=1.2.0-RC1 -Dpackaging=jar
mvn install:install-file -Dfile=./xgboost4j-spark_2.12-1.2.0-RC1.jar -DgroupId=ml.dmlc \
    -DartifactId=xgboost4j-spark_2.12 -Dversion=1.2.0-RC1 -Dpackaging=jar
```

The xgboost4j_2.12-1.2.0-RC1.jar loads just fine on Oracle Linux 7 (roughly equivalent to RHEL/CentOS 7). The error message you get when you request the GPU on a CPU-only machine could probably do with some prettying up, though:
```
jshell> var booster = XGBoost.train(dmatrix, params, 2, Collections.emptyMap(), null, null);
| Exception ml.dmlc.xgboost4j.java.XGBoostError: [22:26:24] /workspace/src/gbm/gbtree.cc:459: Check failed: common::AllVisibleGPUs() >= 1 (0 vs. 1) : No visible GPU is found for XGBoost.
Stack trace:
[bt] (0) /tmp/libxgboost4j12036788363652521596.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x57) [0x7fc89b2dec37]
[bt] (1) /tmp/libxgboost4j12036788363652521596.so(xgboost::gbm::GBTree::GetPredictor(xgboost::HostDeviceVector const*, xgboost::DMatrix*) const+0x531) [0x7fc89b3c0a91]
[bt] (2) /tmp/libxgboost4j12036788363652521596.so(xgboost::gbm::GBTree::PredictBatch(xgboost::DMatrix*, xgboost::PredictionCacheEntry*, bool, unsigned int)+0x32) [0x7fc89b3c0cc2]
[bt] (3) /tmp/libxgboost4j12036788363652521596.so(xgboost::LearnerImpl::UpdateOneIter(int, std::shared_ptr<xgboost::DMatrix>)+0x2c1) [0x7fc89b3f1521]
[bt] (4) /tmp/libxgboost4j12036788363652521596.so(XGBoosterUpdateOneIter+0x55) [0x7fc89b2e1785]
[bt] (5) [0x7fc960a8a5d7]

| at XGBoostJNI.checkCall (XGBoostJNI.java:48)
| at Booster.update (Booster.java:180)
| at XGBoost.trainAndSaveCheckpoint (XGBoost.java:202)
| at XGBoost.train (XGBoost.java:284)
| at XGBoost.train (XGBoost.java:112)
| at XGBoost.train (XGBoost.java:83)
```

@Craigacp Did you set tree_method='gpu_hist'? You should be able to use the CPU algorithm with tree_method='hist'.

@hcho3 Yes, I intentionally set it to gpu_hist to see what the failure mode was on a CPU-only machine. I admit I didn't check the standard CPU algorithm in my quick test, but I assume that's still fine. The default isn't changed over to gpu_hist, right?

@Craigacp No, you have to explicitly opt into gpu_hist.
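For reference, here is a minimal sketch of opting into gpu_hist explicitly, mirroring the XGBoost4J call from the jshell snippet above (the training-file path is a hypothetical placeholder):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import ml.dmlc.xgboost4j.java.Booster;
import ml.dmlc.xgboost4j.java.DMatrix;
import ml.dmlc.xgboost4j.java.XGBoost;
import ml.dmlc.xgboost4j.java.XGBoostError;

public final class TreeMethodExample {
    public static void main(String[] args) throws XGBoostError {
        // Hypothetical LibSVM-format training file.
        DMatrix dtrain = new DMatrix("train.libsvm");

        Map<String, Object> params = new HashMap<>();
        params.put("objective", "binary:logistic");
        // gpu_hist is strictly opt-in; use "hist" (or omit the parameter
        // entirely) on CPU-only machines.
        params.put("tree_method", "gpu_hist");

        // Same call shape as the jshell example earlier in the thread.
        Booster booster = XGBoost.train(dtrain, params, 2,
                Collections.emptyMap(), null, null);

        booster.dispose();
        dtrain.dispose();
    }
}
```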

Thanks. We can maybe clarify the message about the GPU being unavailable.

@hcho3 All blocking PRs are merged into the master branch. I will backport them today.

> I will backport them today.

Merged.

Great! I'm preparing RC2 now.

RC2 is now up. I've also uploaded the JVM packages xgboost4j-gpu and xgboost4j-spark-gpu, with the GPU algorithm enabled.

Maybe change the phrasing to "CUDA 10.0 or later is required"? Same for the Python version?

@JohnZed Fixed.

@dmlc/xgboost-committer XGBoost 1.2.0 has now been released to PyPI and our Maven repository.

@hetong007 Can we submit 1.2.0 to CRAN? Let's submit after Aug 24, when the CRAN maintainers return from vacation.

@CodingCat We should make 1.1.1 and 1.2.0 available on Maven Central. Is there anything I can help with?

1.2.0 is now on CRAN.
