I noticed there's an objective for root mean square error but not for mean absolute error? I'm happy to put together a PR for this if you think this would be useful.
Actually what I think would be even more useful is to generalize this to quantile loss – happy to include that too. My use case is that I want to predict a 90% interval. This could be done by training two separate predictors with quantile loss at 0.05 and 0.95 (rough sketch below).
Let me know if you think this is a bad idea and I won't attempt it :)
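For concreteness, a rough sketch of the two-predictor idea as a custom objective (the data is synthetic just to make it self-contained, and the constant hessian is only a stand-in, more on that below):

```python
import numpy as np
import xgboost as xgb

def pinball_objective(alpha):
    """Quantile (pinball) loss for xgb.train's `obj` argument."""
    def objective(preds, dtrain):
        err = dtrain.get_label() - preds
        # derivative of the pinball loss w.r.t. the prediction:
        # -alpha where we under-predict, (1 - alpha) where we over-predict
        grad = np.where(err > 0, -alpha, 1.0 - alpha)
        # the true second derivative is zero almost everywhere,
        # so use a constant stand-in here
        hess = np.ones_like(preds)
        return grad, hess
    return objective

# synthetic data just so the example runs end to end
rng = np.random.RandomState(0)
X = rng.rand(500, 5)
y = X.sum(axis=1) + rng.randn(500)
dtrain = xgb.DMatrix(X, label=y)

# one model per tail quantile gives a 90% interval
lower = xgb.train({"max_depth": 3}, dtrain, num_boost_round=100, obj=pinball_objective(0.05))
upper = xgb.train({"max_depth": 3}, dtrain, num_boost_round=100, obj=pinball_objective(0.95))
```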
This will definitely be useful. scikit-learn's gradient boosting implementation has squared, absolute, Huber and quantile losses for regression tasks. It would be nice if xgboost also supported them.
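For reference, in scikit-learn this is just a constructor argument, e.g.:

```python
from sklearn.ensemble import GradientBoostingRegressor

# quantile loss, with alpha picking the quantile; two models give a 90% interval
lower = GradientBoostingRegressor(loss="quantile", alpha=0.05)
upper = GradientBoostingRegressor(loss="quantile", alpha=0.95)
```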
I started adding this and it seems pretty straightforward. There is one problem though. xgboost computes the second derivative and for quantile loss it's a Dirac delta, i.e. zero everywhere except an infinite spike around x=0.
I'm not sure how the second derivative (Hessian) is used – it doesn't seem to be used in sklearn's implementation.
Closing this issue for now – will try to put together a pull request with the implementation. But would love to hear if you have any thoughts about what to do with the Hessian.
I looked at this some more and I have implemented something.
The problem is again with the Hessian. From what I understand, xgboost uses Newton's method to find the optimum, which is why the second derivative is needed.
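(For context on why the Hessian matters: each leaf weight comes from a second-order expansion of the loss, roughly `w* = -sum(g_i) / (sum(h_i) + lambda)` over the instances in the leaf. If every `h_i` is zero, the step degenerates to `-sum(g_i) / lambda`, and blows up when lambda is 0.)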
I see several options. One is to use a smooth function like log(exp(-x) + exp(x)), which is similar to the absolute value. I can put together a PR for Huber loss for now, but I'm not super excited about it since it requires a parameter delta. But maybe let's start there?
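A sketch of what that Huber objective could look like in the same custom-objective form (`delta` being the extra parameter):

```python
import numpy as np

def huber_objective(delta):
    """Huber loss as a custom objective for xgb.train (sketch)."""
    def objective(preds, dtrain):
        r = dtrain.get_label() - preds
        quadratic = np.abs(r) <= delta
        # gradient w.r.t. the prediction: -r inside the band, -delta*sign(r) outside
        grad = np.where(quadratic, -r, -delta * np.sign(r))
        # second derivative: 1 inside the band, 0 outside; the zero region
        # is the part the hessian discussion above is worried about
        hess = quadratic.astype(float)
        return grad, hess
    return objective
```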
Closing for now. Also note that the #736 refactor gives a plugin system (https://github.com/dmlc/xgboost/tree/brick/plugin) for adding new losses and metrics.
@erikbern This is exactly what I was looking for. Is there a Huber loss equivalent for the tilted absolute loss case (i.e. when you want quantiles other than the median)?
Even if you did get a second derivative, is there a batch processing method in XGBoost? Otherwise computing the Hessian might be prohibitive.
I implemented Huber but it has the same convergence problems, since its second derivative isn't continuous.
I think the only solution would be to support a first-order method that doesn't require the Hessian.
The twice-differentiable log-cosh loss seems like a reasonably good option for xgboost's second-order framework.
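A sketch in the same custom-objective form; with x = pred - label, the gradient is tanh(x) and the hessian is 1 - tanh(x)^2, both smooth:

```python
import numpy as np

def logcosh_objective(preds, dtrain):
    """log(cosh(pred - label)) as a custom objective for xgb.train (sketch)."""
    x = preds - dtrain.get_label()
    grad = np.tanh(x)               # smooth, but saturates at +/-1 for large |x|
    hess = 1.0 - np.tanh(x) ** 2    # sech^2(x); positive, but tends to 0 for large |x|
    return grad, hess
```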
Hello @tqchen,
Are plugins still the recommended way to add custom loss functions?
Are plugins only usable through the C++ frontend?
If yes, I guess I won't be able to use my custom loss in R.
I assume I should be using the example of a custom objective for R instead.
The log-cosh objective still suffers from the gradient and Hessian being constant for very large off-target predictions, which results in no splits being made. Here is a blog post on the topic, with a possible solution and an implementation in Python: http://www.bigdatarepublic.nl/regression-prediction-intervals-with-xgboost/