Xgboost: Different AUC (binary:logistic)

Created on 17 Jan 2017 · 1 comment · Source: dmlc/xgboost

I'm seeing different AUC values from xgboost's built-in metric and from sklearn's `roc_auc_score`.

import xgboost as xgb
from sklearn.metrics import roc_auc_score

dtrain = xgb.DMatrix(X_train, label=y_train)
dvalidate = xgb.DMatrix(X_validate, label=y_validate)

def feval(preds, dm):
    # binary classes: threshold predicted probabilities at 0.5
    preds = [1 if item > 0.5 else 0 for item in preds]
    labels = dm.get_label()
    auc = roc_auc_score(labels, preds)
    return [('my_auc', auc)]

params = {
    'objective': 'binary:logistic',
    'eval_metric': ['auc', 'error'],
}

booster = xgb.train(
    params,
    dtrain,
    num_boost_round=100,
    evals=[(dvalidate, 'validate')],
    early_stopping_rounds=5,
    verbose_eval=True,
    feval=feval,
)

Output is:

[0] validate-auc:0.723489   validate-error:0.171567 validate-my_auc:0.521209
[1] validate-auc:0.731253   validate-error:0.171684 validate-my_auc:0.521797
[2] validate-auc:0.736759   validate-error:0.171274 validate-my_auc:0.523428
[3] validate-auc:0.74096    validate-error:0.171245 validate-my_auc:0.522919

So the built-in auc is ~0.72, but my_auc is only ~0.52.

What's going on?

Environment info

Operating System: macOS 10.12.2 (16C67)
Package used (python/R/jvm/C++): python

xgboost version used: 0.6

If installing from source, please provide

  1. The commit hash (git rev-parse HEAD) 49ff7c1649eef4cec97c1569f1d8720f0050d72b

If you are using python package, please provide

  1. The python version and distribution: Python 2.7.13 from brew

Most helpful comment

For AUC, prediction values are not supposed to be thresholded.
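In other words, AUC is a ranking metric: thresholding the probabilities to 0/1 before calling `roc_auc_score` throws away the ranking and pushes the score toward chance. A corrected `feval` passes the raw predictions straight through. Below is a sketch of the fix, plus a small demonstration on synthetic data (the numbers are illustrative, not from this issue):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def feval(preds, dm):
    # AUC is a ranking metric: pass the raw probabilities through,
    # do NOT threshold them to 0/1 first.
    labels = dm.get_label()
    return 'my_auc', roc_auc_score(labels, preds)

# Demonstration: thresholding lowers the measured AUC.
rng = np.random.RandomState(0)
labels = rng.randint(0, 2, size=1000)
# Synthetic scores that rank positives above negatives on average.
preds = np.clip(0.2 * labels + 0.8 * rng.rand(1000), 0.0, 1.0)

auc_raw = roc_auc_score(labels, preds)                        # full ranking
auc_thresholded = roc_auc_score(labels, (preds > 0.5).astype(int))
print(auc_raw, auc_thresholded)  # thresholded AUC is noticeably lower
```

Note that `feval` here returns a single `(name, value)` tuple, which is the form `xgb.train` expects for a custom metric.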

