_I'm using xgboost version 0.4 (as shown in the traceback below), compiled from source, with Anaconda 3 on Ubuntu 15.10._
I'm trying to train a classifier on data with 100 classes using map@5 as my evaluation metric.
However, after building 100 trees (for the 100 classes, I presume), it crashes with the following error:
[18:33:08] dmlc-core/include/dmlc/./logging.h:245: [18:33:08] src/metric/rank_metric.cc:159: Check failed: (preds.size()) == (info.labels.size()) label size predict size not match
Traceback (most recent call last):
  File "troy.py", line 81, in <module>
    clf = xgb.train(params, d_train, 200, evals=[(d_train,'train'),(d_valid,'eval')], early_stopping_rounds=50, verbose_eval=True)
  File "/home/mikel/anaconda3/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/training.py", line 186, in train
    bst_eval_set = bst.eval_set(evals, i, feval)
  File "/home/mikel/anaconda3/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/core.py", line 811, in eval_set
    ctypes.byref(msg)))
  File "/home/mikel/anaconda3/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/core.py", line 97, in _check_call
    raise XGBoostError(_LIB.XGBGetLastError())
xgboost.core.XGBoostError: b'[18:33:08] src/metric/rank_metric.cc:159: Check failed: (preds.size()) == (info.labels.size()) label size predict size not match'
My code is the following:
import xgboost as xgb
from sklearn.model_selection import train_test_split

params = {}
params['objective'] = 'multi:softprob'
params['eval_metric'] = 'map@5'
params['num_class'] = 100
params['tree_method'] = 'exact'
params['silent'] = 0
x_train, x_valid, y_train, y_valid = train_test_split(x_train, y_train, test_size=0.3, random_state=4242)
d_train = xgb.DMatrix(x_train, label=y_train)
d_valid = xgb.DMatrix(x_valid, label=y_valid)
watchlist = [(d_valid,'eval'),(d_train,'train')]
clf = xgb.train(params, d_train, 200, watchlist, early_stopping_rounds=50, verbose_eval=True)
Running it with xgboost.cv also throws the same error; however, if I set eval_metric to mlogloss it runs fine. It fails with both the multi:softprob and multi:softmax objectives, and also with eval_metric=map@5-.
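For reference, a sketch of the same setup with the metric swapped to mlogloss, which is the combination that runs cleanly for me (keyword names as in a recent xgboost):

params['eval_metric'] = 'mlogloss'   # multiclass metric instead of map@5
cv_results = xgb.cv(params, d_train, num_boost_round=200, nfold=3,
                    early_stopping_rounds=50, verbose_eval=True)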
I've verified that both y_train and y_valid contain all 100 classes; here are the unique values:
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74
75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99]
and that the number of samples in x_train & y_train, as well as in x_valid & y_valid, is the same.
The label count in the input is equal to num_class, so why is this error thrown?
Yes, I am seeing this too. I added more debug statements in the code and found the reason: preds.size() = info.labels.size() * num_classes. It looks like evaluation is broken for multi-class.
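A minimal standalone sketch (synthetic data, 5 classes instead of 100) of the size relationship the check trips over: multi:softprob emits num_class values per row, while rank_metric.cc expects one prediction per label.

import numpy as np
import xgboost as xgb

rng = np.random.RandomState(0)
X = rng.rand(50, 4)
y = rng.randint(0, 5, size=50)
d = xgb.DMatrix(X, label=y)
bst = xgb.train({'objective': 'multi:softprob', 'num_class': 5}, d, 1)
print(bst.predict(d).shape)   # (50, 5) -> 250 prediction values in total
print(d.get_label().shape)    # (50,)  -> the rank metric expects 50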
@r-tock Is there a workaround for this? E.g. can I modify preds.size or similar, or could I use a custom eval to try and solve this?
I don't think there is; I am just another XGBoost user. This is a bug, and I would wait for one of the XGBoost devs to come and fix it.
The rank metric was not designed for multi-class classification. You can use one of the original multiclass evaluation metrics, or write a customized evaluation function to support this.
To get probability output instead of a class index, consider using multi:softprob as the objective.
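For what it's worth, a customized MAP@5 for softprob output could look roughly like this (a minimal sketch, not part of xgboost: map_at_5 is my own helper, it assumes feval receives num_class prediction values per sample that can be reshaped to one row per sample, and maximize= needs a recent xgboost):

import numpy as np

def map_at_5(preds, dtrain):
    labels = dtrain.get_label().astype(int)
    # softprob gives num_class values per sample; one row per sample.
    probs = preds.reshape(len(labels), -1)
    # Indices of the top 5 classes per sample, best first.
    top5 = np.argsort(-probs, axis=1)[:, :5]
    hits = top5 == labels[:, None]
    # With a single true class per sample, AP@5 is 1/(rank+1) if the
    # true class appears in the top 5, else 0.
    ranks = hits.argmax(axis=1)
    ap = np.where(hits.any(axis=1), 1.0 / (ranks + 1), 0.0)
    return 'map@5', float(ap.mean())

clf = xgb.train(params, d_train, 200, watchlist, feval=map_at_5,
                maximize=True, early_stopping_rounds=50, verbose_eval=True)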
@tqchen Makes sense. It would be useful to add some basic constraint checking in xgboost so that these cases are caught when the binary starts up, not after it has spent a ton of time doing one round of boosting before crashing in the evaluation phase.
@tqchen I think I'm misunderstanding your point here.
To get probability output instead of a class index, consider using multi:softprob as the objective
I am already using multi:softprob as my objective in this example.
The rank metric was not designed for multi-class classification
I'm trying to use mean average precision as my metric, which is in essence a multiclass evaluation metric.
@mxbi Based on how I parse @tqchen's response, he is implying that the current implementation of the map@n metric in xgboost works only with ranking. The xgboost interface is misleading in that it makes it feel like the map@n metric has general application. You will need to implement a map@n version that works with multi-class classification, or fix the current one in the code to do so.
Maybe you could try mlogloss or merror as the eval_metric.
https://github.com/dmlc/xgboost/blob/master/src/metric/multiclass_metric.cc
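For context, the ranking path where map@n does apply pairs one relevance label per row with query grouping via set_group, so there is exactly one prediction per label. Roughly (relevance and group_sizes below are hypothetical placeholders for per-row relevance labels and per-query row counts):

d_rank = xgb.DMatrix(x_train, label=relevance)  # one relevance label per row
d_rank.set_group(group_sizes)                   # rows per query; must sum to the row count
rank_params = {'objective': 'rank:pairwise', 'eval_metric': 'map@5'}
bst = xgb.train(rank_params, d_rank, 50, [(d_rank, 'train')])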
I have tried "mlogloss" and "merror" with the "multi:softprob" objective, but they all throw the same error message.
mlogloss with multi:softprob works fine for me though!
Well, mlogloss with multi:softmax works fine for me too!