Xgboost: [Feature request] Arbitrary base learner

Created on 16 Jun 2020  ·  7 Comments  ·  Source: dmlc/xgboost

It's pretty cool that I can define my own loss function and gradient for xgboost, and then use the linear, tree, or dart base learners to optimize that loss function.

It'd be really cool if I could specify my own base learner, perhaps in the form of an sklearn class with a fit method, a predict method, and support for sample weights.

Being able to apply the XGBoost algorithm to a wider range of base learners would really open up a whole new world of possibilities.
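To make the requested contract concrete, here is a trivial learner satisfying it (purely illustrative; the class name and weighted-mean behavior are made up for this sketch, and nothing here is an existing xgboost API):

```python
import numpy as np

class ConstantLearner:
    """Minimal sketch of the requested contract: fit/predict plus
    sample_weight support (here, a trivial weighted-mean model)."""

    def fit(self, X, y, sample_weight=None):
        # In gradient boosting, y would typically be the negative gradients.
        self.value_ = np.average(y, weights=sample_weight)
        return self

    def predict(self, X):
        return np.full(len(X), self.value_)
```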

feature-request

All 7 comments

@zachmayer Is StackingClassifier / StackingRegressor an option for you? We recently added support for it: #5780

Oops, my bad. When you say "base learner," you mean that you want to fit a boosted ensemble consisting of your custom models?

Yes exactly.

So, for example, if I wanted to boost a kernel SVM, I could do that.


Here is a related issue that has just been opened:

https://github.com/rapidsai/cuml/issues/2435

Adding AdaBoost to cuml might be a good stop-gap measure.

sklearn's AdaBoost already supports arbitrary base learners: sklearn.ensemble.AdaBoostClassifier.
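For instance, a minimal sketch of boosting a kernel SVM that way (assuming a 2020-era sklearn, where the keyword is `base_estimator`; newer releases rename it to `estimator`):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)

# Any estimator whose fit() accepts sample_weight can serve as the
# base learner. (sklearn >= 1.2 renames base_estimator to estimator.)
clf = AdaBoostClassifier(
    base_estimator=SVC(kernel="rbf", gamma="scale"),
    algorithm="SAMME",  # SAMME.R needs predict_proba, which SVC lacks by default
    n_estimators=10,
)
clf.fit(X, y)
print(clf.score(X, y))
```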

XGBoost is way better than AdaBoost though, and supports a bunch of features AdaBoost doesn't have (items 1–4 are sketched after this list):

  1. You can specify an arbitrary loss function in xgboost.
  2. You can specify a gradient for your loss function, and use the gradient in your base learner.
  3. You can specify an arbitrary evaluation function in xgboost.
  4. You can do early stopping with xgboost.
  5. You can run xgboost base learners in parallel, mixing "random forest"-style learning with "boosting"-style learning.
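A minimal sketch of items 1–4 using xgboost's Learning API (the objective, metric, and hyperparameters here are illustrative placeholders):

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, noise=10.0, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
dtrain = xgb.DMatrix(X_tr, label=y_tr)
dvalid = xgb.DMatrix(X_va, label=y_va)

def squared_error(preds, dtrain):
    # (1) + (2): an arbitrary loss, supplied via its gradient and hessian
    grad = preds - dtrain.get_label()
    hess = np.ones_like(preds)
    return grad, hess

def mae(preds, dtrain):
    # (3): an arbitrary evaluation function
    return "mae", float(np.mean(np.abs(preds - dtrain.get_label())))

booster = xgb.train(
    {"max_depth": 3, "eta": 0.1},
    dtrain,
    num_boost_round=500,
    obj=squared_error,
    feval=mae,                 # newer xgboost versions rename this to custom_metric
    evals=[(dvalid, "valid")],
    early_stopping_rounds=10,  # (4): early stopping on the validation set
)
```

For (5), xgboost's `num_parallel_tree` parameter grows several trees per boosting round, random-forest style.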

@tunguz "Run adaboost on a gpu" isn't really what I'm looking for.

"Run adaboost with an arbitrary base learner, arbitrary loss function, arbitrary gradient, arbitrary evaluation, early stopping, and a mix of parallel learners (aka bagging) and boosting" would suit my needs, but that's another way to say "run xgboost with an arbitrary base learner" 😁

Just a follow up on this:

  • ngboost supports arbitrary base learners, which solves the problem for me for now (see the sketch below).
  • There's an interesting new package called GrowNet, with some evidence that boosting weak learners other than trees (specifically shallow neural networks) is useful. (There's a paper, too.)
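A minimal sketch of ngboost's arbitrary-base-learner support, assuming its `Base` constructor argument (the ridge base learner here is just an illustrative choice):

```python
from ngboost import NGBRegressor
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=500, random_state=0)

# ngboost's Base argument accepts any sklearn regressor as the weak
# learner; here a ridge regression stands in for the default shallow tree.
ngb = NGBRegressor(Base=Ridge(alpha=1.0), n_estimators=200)
ngb.fit(X, y)
preds = ngb.predict(X)
```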
