Xgboost: [Feature request] Arbitrary base learner

Created on 16 Jun 2020  ·  7 Comments  ·  Source: dmlc/xgboost

It's pretty cool that I can define my own loss function and gradient for xgboost, and then use the linear, tree, or dart base learners to optimize that loss function.

It'd be really cool if I could specify my own base learner, perhaps in the form of an sklearn class with a fit method, a predict method, and support for sample weights.

Being able to apply the XGBoost algorithm to a wider range of base learners would really open up a whole new world of possibilities.
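To make the requested contract concrete, here is a trivial learner satisfying it (purely illustrative; the class name and weighted-mean behavior are made up for this sketch, and nothing here is an existing xgboost API):

```python
import numpy as np

class ConstantLearner:
    """Minimal sketch of the requested contract: fit/predict plus
    sample_weight support (here, a trivial weighted-mean model)."""

    def fit(self, X, y, sample_weight=None):
        # In gradient boosting, y would typically be the negative gradients.
        self.value_ = np.average(y, weights=sample_weight)
        return self

    def predict(self, X):
        return np.full(len(X), self.value_)
```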

feature-request

All 7 comments

@zachmayer Is StackingClassifier / StackingRegressor an option for you? We recently added support for it: #5780

Oops, my bad. When you say "base learner," you mean that you want to fit a boosted ensemble consisting of your custom models?

Yes exactly.

So, for example, if I wanted to boost a kernel SVM, I could do that.


Here is a related issue that has just been opened:

https://github.com/rapidsai/cuml/issues/2435

Adding AdaBoost to cuml might be a good stop-gap measure.

sklearn's AdaBoost already supports arbitrary base learners: sklearn.ensemble.AdaBoostClassifier.
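For instance, a minimal sketch of boosting a kernel SVM that way (assuming a 2020-era sklearn, where the keyword is `base_estimator`; newer releases rename it to `estimator`):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)

# Any estimator whose fit() accepts sample_weight can serve as the
# base learner. (sklearn >= 1.2 renames base_estimator to estimator.)
clf = AdaBoostClassifier(
    base_estimator=SVC(kernel="rbf", gamma="scale"),
    algorithm="SAMME",  # SAMME.R needs predict_proba, which SVC lacks by default
    n_estimators=10,
)
clf.fit(X, y)
print(clf.score(X, y))
```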

XGBoost is way better than AdaBoost though, and supports a bunch of features AdaBoost doesn't have (items 1–4 are sketched after this list):

  1. You can specify an arbitrary loss function in xgboost.
  2. You can specify a gradient for your loss function, and use the gradient in your base learner.
  3. You can specify an arbitrary evaluation function in xgboost.
  4. You can do early stopping with xgboost.
  5. You can run xgboost base learners in parallel, mixing "random forest"-style learning with "boosting"-style learning.
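A minimal sketch of items 1–4 using xgboost's Learning API (the objective, metric, and hyperparameters here are illustrative placeholders):

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, noise=10.0, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
dtrain = xgb.DMatrix(X_tr, label=y_tr)
dvalid = xgb.DMatrix(X_va, label=y_va)

def squared_error(preds, dtrain):
    # (1) + (2): an arbitrary loss, supplied via its gradient and hessian
    grad = preds - dtrain.get_label()
    hess = np.ones_like(preds)
    return grad, hess

def mae(preds, dtrain):
    # (3): an arbitrary evaluation function
    return "mae", float(np.mean(np.abs(preds - dtrain.get_label())))

booster = xgb.train(
    {"max_depth": 3, "eta": 0.1},
    dtrain,
    num_boost_round=500,
    obj=squared_error,
    feval=mae,                 # newer xgboost versions rename this to custom_metric
    evals=[(dvalid, "valid")],
    early_stopping_rounds=10,  # (4): early stopping on the validation set
)
```

For (5), xgboost's `num_parallel_tree` parameter grows several trees per boosting round, random-forest style.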

@tunguz "Run adaboost on a gpu" isn't really what I'm looking for.

"Run adaboost with an arbitrary base learner, arbitrary loss function, arbitrary gradient, arbitrary evaluation, early stopping, and a mix of parallel learners (aka bagging) and boosting" would suit my needs, but that's another way to say "run xgboost with an arbitrary base learner" 😁

Just a follow up on this:

  • ngboost supports arbitrary base learners, which solves the problem for me for now (see the sketch below).
  • There's an interesting new package called GrowNet, with some evidence that boosting weak learners other than trees (specifically shallow neural networks) is useful. (There's a paper, too.)
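A minimal sketch of ngboost's arbitrary-base-learner support, assuming its `Base` constructor argument (the ridge base learner here is just an illustrative choice):

```python
from ngboost import NGBRegressor
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=500, random_state=0)

# ngboost's Base argument accepts any sklearn regressor as the weak
# learner; here a ridge regression stands in for the default shallow tree.
ngb = NGBRegressor(Base=Ridge(alpha=1.0), n_estimators=200)
ngb.fit(X, y)
preds = ngb.predict(X)
```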
