Xgboost: 'label size predict size not match' when using map@k in python package

Created on 24 Apr 2016 · 11 comments · Source: dmlc/xgboost

_I'm using xgboost version 0.4, compiled from source with Anaconda 3 on Ubuntu 15.10._

I'm trying to train a classifier on data with 100 classes using map@5 as my evaluation metric.
However, after building 100 trees (for the 100 classes, I presume), it crashes with the following error:

[18:33:08] dmlc-core/include/dmlc/./logging.h:245: [18:33:08] src/metric/rank_metric.cc:159: Check failed: (preds.size()) == (info.labels.size()) label size predict size not match
Traceback (most recent call last):
  File "troy.py", line 81, in <module>
    clf = xgb.train(params, d_train, 200, evals=[(d_train,'train'),(d_valid,'eval')], early_stopping_rounds=50, verbose_eval=True)
  File "/home/mikel/anaconda3/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/training.py", line 186, in train
    bst_eval_set = bst.eval_set(evals, i, feval)
  File "/home/mikel/anaconda3/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/core.py", line 811, in eval_set
    ctypes.byref(msg)))
  File "/home/mikel/anaconda3/lib/python3.5/site-packages/xgboost-0.4-py3.5.egg/xgboost/core.py", line 97, in _check_call
    raise XGBoostError(_LIB.XGBGetLastError())
xgboost.core.XGBoostError: b'[18:33:08] src/metric/rank_metric.cc:159: Check failed: (preds.size()) == (info.labels.size()) label size predict size not match'

My code is the following:

import xgboost as xgb
from sklearn.model_selection import train_test_split

params = {}
params['objective'] = 'multi:softprob'
params['eval_metric'] = 'map@5'
params['num_class'] = 100
params['tree_method'] = 'exact'
params['silent'] = 0

# Hold out 30% of the training data for evaluation.
x_train, x_valid, y_train, y_valid = train_test_split(x_train, y_train, test_size=0.3, random_state=4242)
d_train = xgb.DMatrix(x_train, label=y_train)
d_valid = xgb.DMatrix(x_valid, label=y_valid)
watchlist = [(d_valid, 'eval'), (d_train, 'train')]
clf = xgb.train(params, d_train, 200, watchlist, early_stopping_rounds=50, verbose_eval=True)

Running it with xgboost.cv throws the same error. However, if I set eval_metric to mlogloss it runs fine. It fails with both the multi:softprob and multi:softmax objectives, and with eval_metric='map@5-' as well.

I've verified that both y_train and y_valid have all 100 classes in them; here are the unique values:

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74
 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99]

and that the number of samples in x_train & y_train as well as x_valid & y_valid are the same.

The label count in the input is equal to num_class, so why is this error thrown?

All 11 comments

Yes, I am seeing this. I added more debug statements in the code and found that the reason is preds.size() == info.labels.size() * num_class. It looks like evaluation is broken for multi-class.
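To make the arithmetic concrete, here is a small sketch of why the check in rank_metric.cc fails (the value of n is hypothetical, not from the thread):

n = 1000                      # rows in the eval DMatrix (hypothetical)
num_class = 100
preds_size = n * num_class    # multi:softprob emits one probability per class per row
labels_size = n               # but there is only one label per row
assert preds_size == labels_size * num_class  # so preds.size() != info.labels.size()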

@r-tock Is there a workaround for this? E.g., can I modify preds.size or similar, or could I use a custom eval to try and solve this?

I don't think there is; I'm just another XGBoost user. This is a bug, and I would wait for one of the XGBoost devs to come and fix it.

The rank metric was not designed for multi-class classification. You can use one of the original multi-class evaluation metrics, or write a customized evaluation function to support this.

To get probability output instead of a class index, consider using multi:softprob as the objective.
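Following that suggestion, here is a minimal sketch of such a customized evaluation function, computing MAP@5 over multi:softprob output; the name map_at_5 and the reshaping logic are my own assumptions, not code from the thread:

import numpy as np

def map_at_5(preds, dtrain):
    # feval receives the raw multi:softprob output; depending on the xgboost
    # version it arrives flat with n * num_class entries or as (n, num_class),
    # so reshape defensively.
    labels = dtrain.get_label().astype(int)
    preds = preds.reshape(len(labels), -1)
    # Top-5 class indices per row, highest probability first.
    top5 = np.argsort(preds, axis=1)[:, ::-1][:, :5]
    # With exactly one relevant class per row, AP@5 reduces to the reciprocal
    # rank of the true label when it appears in the top 5, else 0.
    score = 0.0
    for row, label in zip(top5, labels):
        hits = np.where(row == label)[0]
        if hits.size:
            score += 1.0 / (hits[0] + 1)
    return 'map@5', score / len(labels)

Pass it as feval=map_at_5 to xgb.train and drop eval_metric='map@5' from params, so the built-in rank metric never runs.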

@tqchen Makes sense. It would be useful to add some basic constraint checking in xgboost so that these mismatches are caught when the binary starts up, not after it has spent a ton of time on a round of boosting before crashing in the evaluation phase.

@tqchen I think I'm misunderstanding your point here.

To get probability output instead of index, consider use multi:softprob as objective

I am using multi:softprob in this example as my objective

rank metric was not designed for multi-class classification

I'm trying to use mean average precision as my metric, which is in essence a multiclass evaluation metric.

@mxbi Based on how I parse @tqchen's response, he is implying that the current implementation of the map@n metric in xgboost works only for ranking. The xgboost interface is misleading in that it makes the map@n metric look generally applicable. You will need to implement a map@n version that works with multi-class classification, or fix the current one in the code to do so.

Maybe you could try mlogloss or merror as the eval_metric.
https://github.com/dmlc/xgboost/blob/master/src/metric/multiclass_metric.cc
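For completeness, that swap is a one-line change to the original script (a sketch reusing the params and watchlist defined above):

params['eval_metric'] = 'mlogloss'   # or 'merror'; both support multi-class output
clf = xgb.train(params, d_train, 200, watchlist, early_stopping_rounds=50, verbose_eval=True)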

I have tried both "mlogloss" and "merror" with the "multi:softprob" objective, yet they all throw the same error message.

mlogloss with multi:softprob works fine for me, though!

Well, mlogloss with multi:softmax works fine for me too!
