I'm using the Python version of XGBoost and trying to set early stopping on AUC as follows:

```python
param = {
    'bst:max_depth': 4,
    'bst:eta': 0.1,
    'silent': 1,
    'objective': 'binary:logistic'
}
param['nthread'] = 10
param['eval_metric'] = "auc"
param['seed'] = 0
plst = param.items()
evallist = [(dtrain_test1, 'train'), (dtest2, 'eval')]
num_round = 50
bst = xgb.train(plst, dtrain_test1, num_round, evallist, early_stopping_rounds=5)
```
However, even though the AUC is still increasing, training stops after 5 rounds:

```
Will train until eval error hasn't decreased in 5 rounds.
[0] train-auc:0.681576 eval-auc:0.672914
[1] train-auc:0.713940 eval-auc:0.705898
[2] train-auc:0.719168 eval-auc:0.710064
[3] train-auc:0.724578 eval-auc:0.713953
[4] train-auc:0.729903 eval-auc:0.718029
[5] train-auc:0.732958 eval-auc:0.719815
Stopping. Best iteration:
[0] train-auc:0.681576 eval-auc:0.672914
```
It looks as if XGBoost assumes the metric should keep decreasing, and triggers early stopping as soon as it fails to. Why is this the case, and how can I fix it?
One solution is to define your own eval metric, as explained in https://github.com/tqchen/xgboost/blob/master/demo/guide-python/custom_objective.py, and compute (-auc) instead of auc so that the value decreases as the model improves.
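A minimal sketch of that workaround, assuming the usual `feval` convention of `(preds, dtrain) -> (name, value)`; the function name `neg_auc` and the rank-based AUC computation here are my own illustration, not code from the linked demo:

```python
import numpy as np

def neg_auc(preds, dtrain):
    """Custom eval metric returning NEGATED AUC, so that early stopping's
    default 'lower is better' rule works correctly."""
    labels = dtrain.get_label()
    # Rank-based AUC: probability that a random positive outranks a random negative.
    order = np.argsort(preds)
    ranks = np.empty(len(preds))
    ranks[order] = np.arange(1, len(preds) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    auc = (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    return 'neg_auc', -auc
```

You would then pass it via `feval=neg_auc` in `xgb.train` (and drop `eval_metric` from the params), so the printed `neg_auc` decreases while the true AUC increases.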
Thanks @myouness! That's indeed a solution. Is this behavior a bug in the package?
Maybe you can try setting `maximize=True`; it's available in the `xgboost.train` and `xgboost.cv` methods.