Xgboost: Mean absolute error?

Created on 6 Nov 2015 · 10 comments · Source: dmlc/xgboost

I noticed there's an objective for root mean square error but not for mean absolute error? I'm happy to put together a PR for this if you think this would be useful.

Actually, what I think would be even more useful is to generalize this to quantile loss – happy to include that too. My use case is that I want to predict a 90% interval. This could be done by training two separate predictors, one with quantile loss at 0.05 and one at 0.95.

Let me know if you think this is a bad idea and I won't attempt it :)
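
For concreteness, here is a minimal NumPy sketch of the quantile (pinball) loss I mean; the function name and `alpha` argument are just illustrative:

```python
import numpy as np

def pinball_loss(y_true, y_pred, alpha):
    """Quantile (pinball) loss: under-predictions cost alpha per unit,
    over-predictions cost (1 - alpha) per unit."""
    residual = y_true - y_pred
    return np.mean(np.maximum(alpha * residual, (alpha - 1) * residual))

# A 90% interval would come from two separate models:
# the lower bound trained against pinball_loss(..., alpha=0.05),
# the upper bound trained against pinball_loss(..., alpha=0.95).
```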


All 10 comments

This will definitely be useful. scikit-learn's implementation has squared, absolute, Huber, and quantile losses for regression tasks. It would be nice if xgboost also supported them.

I started adding this and it seems pretty straightforward. There is one problem, though: xgboost computes the second derivative, and for quantile loss it's a Dirac delta, i.e. zero everywhere except an infinite spike at x = 0.

I'm not sure how the second derivative (Hessian) is used – it doesn't seem to be used in sklearn's implementation.

Closing this issue for now – will try to put together a pull request with the implementation. But would love to hear if you have any thoughts about what to do with the Hessian.
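
To make the Hessian issue concrete, here is a hedged sketch of what a quantile objective plugged into xgboost's Python `obj=` hook could look like, assuming the standard `(preds, dtrain) -> (grad, hess)` custom-objective signature; the constant Hessian is a common workaround rather than something this thread settled on, and the helper name is made up:

```python
import numpy as np
import xgboost as xgb

def quantile_objective(alpha):
    """Custom objective for xgb.train. The true second derivative of the
    pinball loss is zero almost everywhere (a Dirac delta at zero residual),
    so a constant stands in for it to keep the Newton step finite."""
    def objective(preds, dtrain):
        residual = dtrain.get_label() - preds
        grad = np.where(residual > 0, -alpha, 1.0 - alpha)  # d loss / d pred
        hess = np.full_like(preds, 1.0)  # placeholder, NOT the true Hessian
        return grad, hess
    return objective

# dtrain = xgb.DMatrix(X, label=y)
# booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain,
#                     num_boost_round=100, obj=quantile_objective(0.95))
```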

I looked at this some more and I have implemented something.

The problem is again with the Hessian. From what I understand, xgboost uses Newton's method to find the optimum so that's why the second derivative is needed.

I see several options

  • Instead of mean absolute error, use Huber loss
  • Instead of mean absolute error, use something like log(exp(-x) + exp(x)) which is similar
  • Use mean absolute error and make xgboost work with first order methods

I can put together a PR for Huber loss for now, but I'm not super excited about it since it requires a delta parameter. But maybe let's start there?
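
For reference, a rough sketch of what such a Huber objective might look like; the `delta` default, the helper name, and the tiny constant Hessian used in the linear region are illustrative choices, not a definitive implementation:

```python
import numpy as np

def huber_objective(delta=1.0):
    """Huber loss as a custom objective: quadratic within `delta` of the
    target, linear outside. The second derivative jumps from 1 to ~0 at
    |residual| == delta, which is the discontinuity discussed below."""
    def objective(preds, dtrain):
        residual = dtrain.get_label() - preds
        quadratic = np.abs(residual) <= delta
        grad = np.where(quadratic, -residual, -delta * np.sign(residual))
        hess = np.where(quadratic, 1.0, 1e-6)  # small constant instead of exact zero
        return grad, hess
    return objective
```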

Closing for now. Also note that the #736 refactor provides a plugin system (https://github.com/dmlc/xgboost/tree/brick/plugin) for adding new losses and metrics.

@erikbern I was looking for this exactly. Is there a Huber loss equivalent for the tilted absolute loss case (i.e. when you want other quantiles besides the median)?

Even if you did get a second derivative, is there a batch-processing method in XGBoost? Otherwise computing the Hessian might be prohibitive.

I implemented Huber but it has the same convergence problems, since its second derivative isn't continuous.

I think the only solution would be to support a first-order method that doesn't require the Hessian

The twice-differentiable log-cosh loss seems like a reasonably good option for xgboost's second-order framework.
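
A sketch of what that could look like as a custom objective (again assuming the `(preds, dtrain) -> (grad, hess)` signature); both derivatives of log(cosh(x)) exist everywhere:

```python
import numpy as np

def logcosh_objective(preds, dtrain):
    """log(cosh(pred - label)) as a custom objective: smooth everywhere,
    so both derivatives required by the second-order framework exist."""
    x = preds - dtrain.get_label()
    grad = np.tanh(x)
    hess = 1.0 - np.tanh(x) ** 2  # sech^2(x), strictly positive
    return grad, hess
```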

Hello @tqchen,

Are plugins still the recommended way to add custom loss functions?

Are plugins only usable through the C++ frontend?
If yes, I guess I won't be able to use my custom loss in R.

I assume I should be using the example of a custom objective for R instead.

The log-cosh objective still suffers from the gradient and Hessian being constant for very large off-target predictions, which results in an absence of splits. Here is a blog post on the topic with a possible solution and a Python implementation: http://www.bigdatarepublic.nl/regression-prediction-intervals-with-xgboost/
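
A quick numeric illustration of that saturation: for large residuals the log-cosh gradient flattens to ±1 and the second derivative collapses toward zero, so the second-order gain xgboost uses to evaluate splits vanishes.

```python
import numpy as np

residuals = np.array([0.1, 1.0, 5.0, 20.0])
grad = np.tanh(residuals)             # ~ [0.0997, 0.7616, 0.9999, 1.0000]
hess = 1.0 - np.tanh(residuals) ** 2  # ~ [0.9901, 0.4200, 0.0002, 0.0000]
```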
