I keep getting this error on Windows 10 every time I try to run this code:
```
from skopt import BayesSearchCV
from xgboost import XGBClassifier

xgb = XGBClassifier()
param_xgb = {
    'n_estimators': (10, 1000),
    'base_score': (0.01, 1, 'uniform')
}
xgb_grid = BayesSearchCV(
    estimator=xgb, search_spaces=param_xgb, scoring='recall', n_jobs=-1, cv=10)
xgb_grid.fit(X_train, y_train)
```
The full error output is this:
XGBoostError: b'[16:54:42] c:\\users\\administrator\\desktop\\xgboost\\src\\objective\\./regression_loss.h:62: Check failed: base_score > 0.0f && base_score < 1.0f base_score must be in (0,1) for logistic loss'
What am I doing wrong? I get the same error even with scikit-learn's GridSearchCV.
Which language are you using for Windows? There were locale issues reported for non-English Windows.
@hcho3 Danish locale. But I switched to an English keyboard during the running of that code (if that even helps).
@PyDataBlog The locale bug was fixed in #3891, but it is not part of the 0.81 release. In your case, XGBoost was expecting the European notation (0,01) whereas scikit-learn was using the American notation (0.01).
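The notation mismatch can be illustrated with a minimal sketch (no real locale switching is done here; the Danish rendering is constructed by hand purely for illustration):

```python
# Illustrative sketch of the two decimal notations involved in the bug.
# The Danish form is built by hand; no actual locale is changed.
value = 0.01
american = repr(value)                  # "0.01" -- what scikit-learn writes
european = american.replace('.', ',')   # "0,01" -- how a Danish locale renders it
print(american, european)  # 0.01 0,01
```

An XGBoost build affected by the bug would try to parse the comma form and fail the `base_score` range check.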
Can you try using the latest source? Use https://xgboost-ci.net/job/restricted-xgboost/job/master/80/artifact/dist/linux_cuda8.0_nonccl_omp_py2.7/py/dist/xgboost-0.81-py2.py3-none-any.whl. With #3891, the American notation will be used exclusively.
@hcho3 that's great. What if I want to stick with the stable version, which has the locale bug? Will it accept (0,01, 1, 'uniform')?
@PyDataBlog No, the 0.81 release won't accept (0.01, 1, 'uniform'). We are working on release 0.82, which will contain the fix. Alternatively, you can switch the system locale to "English (United States)".
@PyDataBlog We just published the new 0.82 stable release. Python wheels will be available very soon.
@hcho3 I can't wait to have my hands on 0.82. Project has stalled waiting for the fix. Thanks for your amazing dedication.
@PyDataBlog It is up now. Try running pip install xgboost==0.82
Everything works as expected with the new update. Great job!
@hcho3 unfortunately, the same issue pops up when I try to use scikit-learn's GridSearchCV:
```
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

xgb = XGBClassifier()
param_xgb = {
    'base_score': [0.01, 1],
    'learning_rate': [0.01, 1]
}
xgb_grid = GridSearchCV(estimator=xgb,
                        param_grid=param_xgb,
                        scoring='recall',
                        n_jobs=-1,
                        cv=10)
xgb_grid.fit(X_train, y_train)
```
XGBoostError: b'[14:33:06] ..\\xgboost\\src\\objective\\./regression_loss.h:62: Check failed: base_score > 0.0f && base_score < 1.0f base_score must be in (0,1) for logistic loss'
Are you using the latest version?
yep, 0.82
whoa, 0.90 is out already?!
Can you post some toy data so that I can reproduce the problem you're experiencing?
I just updated to 0.90. I will rerun the code and post toy data here if the same issue persists.
@hcho3 still persists. I will push some toy data.
XGBoostError: [15:05:08] c:\jenkins\workspace\xgboost-win64_release_0.90\src\objective\./regression_loss.h:62: Check failed: base_score > 0.0f && base_score < 1.0f: base_score must be in (0,1) for logistic loss
@PyDataBlog Are you using English locale?
Yes. I do have a Danish locale installed, but I don't use it; English is the default.
Any updates?
@hcho3 it still didn't work. It's the old bug resurfacing again.
Can you post your data and script?
@hcho3
features.zip
binary_target.zip
Here are the compressed NumPy arrays for the features and target. After unzipping, the arrays should be loaded with joblib.
```
import joblib
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV

XX_scaled = joblib.load('features.joblib')
y = joblib.load('binary_target.joblib')

xgb = XGBClassifier()
parameters_to_be_searched = {
    'base_score': [0.01, 1],
    'learning_rate': [0.01, 1],
    'n_estimators': [1, 500],
}
xgb_grid = GridSearchCV(estimator=xgb,
                        param_grid=parameters_to_be_searched,
                        scoring='accuracy',
                        cv=10,
                        n_jobs=-1)
xgb_grid.fit(XX_scaled, y)
xgb_grid.best_score_
xgb_grid.best_estimator_
```
I get this error:
XGBoostError: [11:59:38] c:\jenkins\workspace\xgboost-win64_release_0.90\src\objective\./regression_loss.h:62: Check failed: base_score > 0.0f && base_score < 1.0f: base_score must be in (0,1) for logistic loss
@hcho3 any progress over this issue?
@PyDataBlog Sorry for the delay, I recently had a 2-week vacation and somehow this issue slipped through.
I just made a clean installation of Windows 10 inside VirtualBox (+ Miniconda) and then ran your example script. Unfortunately, I was not able to reproduce the error. Let me try again with a different system locale.
I tried the script again after changing the system locale to Danish, and still I do not see the error.
Can you post the version of scikit-learn you are using? You can get the version number by running
pip show scikit-learn
Mine shows
Name: scikit-learn
Version: 0.21.2
Summary: A set of python modules for machine learning and data mining
Home-page: http://scikit-learn.org
Author: None
Author-email: None
License: new BSD
Location: c:\users\hcho3\miniconda3\lib\site-packages
Requires: joblib, scipy, numpy
Required-by:
@PyDataBlog So actually I did get the error. The fix seems to be changing the line
'base_score': [0.01, 1],
to
'base_score': [0.01, 0.99],
since the requirement is that the base score be strictly less than 1.0.
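For anyone hitting this later, a minimal sketch of the corrected grid (parameter names taken from the script above; the assertion is just an illustrative pre-flight check, not part of any library API):

```python
# Corrected grid: logistic loss requires base_score strictly inside (0, 1),
# so the upper value 1 is replaced with 0.99.
parameters_to_be_searched = {
    'base_score': [0.01, 0.99],
    'learning_rate': [0.01, 1],
    'n_estimators': [1, 500],
}

# Pre-flight sanity check of the open-interval constraint before
# handing the grid to GridSearchCV:
assert all(0.0 < b < 1.0 for b in parameters_to_be_searched['base_score'])
```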
@hcho3 thanks for the fix. Glad to have you back.