Xgboost: [GPU] scalability with feature number and results reproducibility

Created on 23 Jan 2019  ·  9 Comments  ·  Source: dmlc/xgboost

Hi, @RAMitchell and @trivialfis

I just came across the blog post by @Laurae2 (I hope I have tagged the right person):

https://medium.com/data-design/xgboost-gpu-performance-on-low-end-gpu-vs-high-end-cpu-a7bc5fcd425b

The blog post confirms that we have a super-fast GPU algorithm, but it also raises two issues:

  • The GPU implementation is prone to crashing with high-dimensional feature vectors

  • There are some issues with the reproducibility of results from the GPU hist algorithm

Would you please share some insights on (1) whether these are known issues (or whether they really exist), and (2) the reasons for them, if they do exist?

Thanks

Nan

All 9 comments

@CodingCat
For the first one, my guess is a memory limitation. For the second one, there is an open issue: #3921

That's a very detailed benchmark. @Laurae2 Huge thanks. I plan to address these problems one by one in the future.
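To make the memory guess concrete, here is a rough back-of-the-envelope sketch in R. It only counts raw storage for the feature matrix. The 1-byte-per-entry figure for a quantized copy is an assumption for illustration, not a statement about how xgboost lays out data on the GPU, and `matrix_mib` is a hypothetical helper.

```r
# Rough storage estimate for an n_rows x n_cols feature matrix, in MiB.
# bytes_per_entry = 8 matches an R double; any other value is an assumption.
matrix_mib <- function(n_rows, n_cols, bytes_per_entry = 8) {
  n_rows * n_cols * bytes_per_entry / 1024^2
}

# 10 million rows with a growing number of features
for (p in c(50, 100, 200)) {
  cat(sprintf(
    "p = %3d: %8.1f MiB as doubles, %7.1f MiB at 1 byte/entry (assumed)\n",
    p, matrix_mib(1e7, p), matrix_mib(1e7, p, 1)
  ))
}
```

Under these assumptions the footprint grows linearly with the number of features, so a wide dataset can exhaust a small GPU well before the row count looks alarming.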

How does the GPU updater obtain random numbers? Does it use the same mechanism as the CPU updater?

@trivialfis thanks for the response.

For the first one, I see @Laurae2 pointed out that adding more features has a 5-15% higher weight than adding samples to your dataset. Is that also related to the parallelization mechanism in the GPU implementation?

> Does it use the same mechanism as the CPU updater?

For feature sampling, it uses ColumnSampler from /common/random.h, so it should be the same.
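As a toy illustration of why a shared, seeded sampler should yield the same feature subset on both code paths, the following R sketch draws a column subset twice from the same seed. `sample_columns` is a hypothetical stand-in written for this example; it is not the actual `ColumnSampler` code.

```r
# Hypothetical stand-in for a seeded column sampler: pick a fraction of the
# feature indices using R's RNG, seeded explicitly on every call.
sample_columns <- function(n_features, colsample, seed) {
  set.seed(seed)
  k <- max(1L, as.integer(floor(n_features * colsample)))
  sort(sample.int(n_features, k))
}

# Two callers sharing the same mechanism and seed get the same subset.
cpu_cols <- sample_columns(100, 0.5, seed = 11111)
gpu_cols <- sample_columns(100, 0.5, seed = 11111)
identical(cpu_cols, gpu_cols)
```

If results still differ between runs with identical sampled columns, the source of non-determinism has to be elsewhere in the pipeline.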

> adding more features has a 5-15% higher weight than adding samples to your dataset

The GPU implementation doesn't use the CSR format; ELLPACK is used instead. So it's not surprising.
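For readers unfamiliar with the two layouts, here is a minimal R sketch of the storage difference. CSR keeps only the present entries (plus index arrays, ignored here), while ELLPACK pads every row to the length of the longest row. This is a textbook illustration with hypothetical helper names; it does not reproduce xgboost's internal data structures.

```r
# Count stored value slots for a matrix in which NA marks a missing entry.
# Index arrays (column indices, row offsets) are ignored for simplicity.
csr_slots <- function(X) {
  sum(!is.na(X))                   # one slot per present value
}
ellpack_slots <- function(X) {
  row_len <- rowSums(!is.na(X))    # present values per row
  nrow(X) * max(row_len)           # every row padded to the longest one
}

# 4 rows, 5 features; only the third row is fully populated
X <- matrix(c( 1, NA, NA, NA, NA,
               2,  3, NA, NA, NA,
               4,  5,  6,  7,  8,
              NA, NA,  9, NA, NA),
            nrow = 4, byrow = TRUE)

csr_slots(X)      # 9 present values
ellpack_slots(X)  # 4 rows * 5 slots = 20
```

On a fully dense matrix both counts equal rows times features, so under this layout each added feature costs a full column of slots.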

@Laurae2 thanks for the useful feedback!

In summary, here are the things we need to improve:

  • Reproducibility
  • Address crashes when using different numbers of threads/GPUs
  • More test coverage for the above
  • More user-friendly handling of out-of-memory crashes
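A reproducibility test along these lines could be sketched as follows. `is_reproducible` is a hypothetical helper that runs the same fitting closure several times and checks that the outputs are identical; seeded and unseeded base-R functions stand in for the model so the sketch runs without a GPU.

```r
# Run `fit` n_runs times and report whether every run returns exactly the
# same object as the first run.
is_reproducible <- function(fit, n_runs = 3) {
  first <- fit()
  for (i in seq_len(n_runs - 1)) {
    if (!identical(first, fit())) return(FALSE)
  }
  TRUE
}

# Stand-in "models": one reseeds before drawing, the other does not.
seeded_fit <- function() {
  set.seed(11111)
  rnorm(5)
}
unseeded_fit <- function() rnorm(5)

is_reproducible(seeded_fit)    # TRUE
is_reproducible(unseeded_fit)  # FALSE
```

In practice `fit` would wrap an `xgb.train` call followed by `predict` on a fixed `xgb.DMatrix`, with the tree method under test passed in the parameter list.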

@RAMitchell do you also know why xgboost GPU crashes when using too large a depth, even if GPU RAM is available?

Not sure. If you have a reproducible example, that would be greatly appreciated.

@RAMitchell It no longer seems reproducible on newer commits of xgboost.

The following used to crash on a GPU with 4GB of RAM, but no longer does:

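library(xgboost)

# Simulate a dense binary-classification dataset:
# N rows, p features, of which pp carry signal
set.seed(1)
N <- 10000000
p <- 100
pp <- 25
X <- matrix(runif(N * p), ncol = p)
betas <- 2 * runif(pp) - 1
sel <- sort(sample(p, pp))
m <- X[, sel] %*% betas - 1 + rnorm(N)
y <- rbinom(N, 1, plogis(m))

# Report the in-memory size of the feature matrix
format(object.size(X), units = "Mb")

dtrain <- xgboost::xgb.DMatrix(X, label = y)
gc(verbose = FALSE)

# Train 3 rounds of deep trees (max_depth = 13).
# Note: as posted, tree_method is "hist", the CPU histogram method.
set.seed(11111)
model <- xgb.train(
  params = list(
    objective = "binary:logistic",
    nthread = 1,
    eta = 0.10,
    max_depth = 13,
    max_bin = 255,
    tree_method = "hist"
  ),
  data = dtrain,
  nrounds = 3,
  verbose = 1,
  watchlist = list(train = dtrain)
)

# Release the DMatrix and the model
rm(dtrain, model)
gc(verbose = FALSE)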

However, the following still hangs on 2 GPUs when using nthread = 1:

library(xgboost)

# Same simulation as above, but small: 1,000 rows and 50 features
set.seed(1)
N <- 1000
p <- 50
pp <- 25
X <- matrix(runif(N * p), ncol = p)
betas <- 2 * runif(pp) - 1
sel <- sort(sample(p, pp))
m <- X[, sel] %*% betas - 1 + rnorm(N)
y <- rbinom(N, 1, plogis(m))

# Report the in-memory size of the feature matrix
format(object.size(X), units = "Mb")

dtrain <- xgboost::xgb.DMatrix(X, label = y)
gc(verbose = FALSE)

# Train with gpu_hist on 2 GPUs; nthread = 1 combined with n_gpus = 2
# is the configuration reported to hang.
set.seed(11111)
model <- xgb.train(
  params = list(
    objective = "binary:logistic",
    nthread = 1,
    eta = 0.10,
    max_depth = 5,
    max_bin = 255,
    tree_method = "gpu_hist",
    n_gpus = 2
  ),
  data = dtrain,
  nrounds = 3,
  verbose = 1,
  watchlist = list(train = dtrain)
)

# Release the DMatrix and the model
rm(dtrain, model)
gc(verbose = FALSE)

Closing this for now, as the main purpose of filing the issue (raising awareness of the blog post and getting more insight into the issues mentioned there) has been achieved, and work to fix the issues is ongoing.
