I keep getting this error on Windows 10 every time I try to run this code:
```
from skopt import BayesSearchCV
from xgboost import XGBClassifier

xgb = XGBClassifier()
param_xgb = {
    'n_estimators': (10, 1000),
    'base_score': (0.01, 1, 'uniform')
}
xgb_grid = BayesSearchCV(
    estimator=xgb, search_spaces=param_xgb, scoring='recall', n_jobs=-1, cv=10)
xgb_grid.fit(X_train, y_train)
```
The full error output is this:
XGBoostError: b'[16:54:42] c:\\users\\administrator\\desktop\\xgboost\\src\\objective\\./regression_loss.h:62: Check failed: base_score > 0.0f && base_score < 1.0f base_score must be in (0,1) for logistic loss'
What am I doing wrong? I get the same error even with scikit-learn's GridSearchCV.
Which language are you using for Windows? There were locale issues reported for non-English Windows.
@hcho3 Danish locale. But I switched to an English keyboard during the running of that code (if that even helps).
@PyDataBlog The locale bug was fixed in #3891, but it is not part of the 0.81 release. In your case, XGBoost was expecting the European notation (0,01) whereas scikit-learn was using the American notation (0.01).
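The notation mismatch can be illustrated with a minimal sketch (no real locale switching is done here; the Danish rendering is constructed by hand purely for illustration):

```python
# Illustrative sketch of the two decimal notations involved in the bug.
# The Danish form is built by hand; no actual locale is changed.
value = 0.01
american = repr(value)                  # "0.01" -- what scikit-learn writes
european = american.replace('.', ',')   # "0,01" -- how a Danish locale renders it
print(american, european)  # 0.01 0,01
```

An XGBoost build affected by the bug would try to parse the comma form and fail the `base_score` range check.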
Can you try using the latest source? Use https://xgboost-ci.net/job/restricted-xgboost/job/master/80/artifact/dist/linux_cuda8.0_nonccl_omp_py2.7/py/dist/xgboost-0.81-py2.py3-none-any.whl. With #3891, the American notation will be used exclusively.
@hcho3 that's great. What if I want to stick with the stable version, which has the locale bug? Will it accept (0,01, 1, 'uniform')?
@PyDataBlog No, the 0.81 release won't accept (0.01, 1, 'uniform'). We are working on release 0.82, which will contain the fix. Alternatively, you can switch the system locale to "English (United States)".
@PyDataBlog We just published the new 0.82 stable release. Python wheels will be available very soon.
@hcho3 I can't wait to have my hands on 0.82. Project has stalled waiting for the fix. Thanks for your amazing dedication.
@PyDataBlog It is up now. Try running pip install xgboost==0.82
Everything works as expected with the new update. Great job!
@hcho3 unfortunately, the same issue pops up when I try to use scikit-learn's GridSearchCV:
```
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

xgb = XGBClassifier()
param_xgb = {
    'base_score': [0.01, 1],
    'learning_rate': [0.01, 1]
}
xgb_grid = GridSearchCV(estimator=xgb,
                        param_grid=param_xgb,
                        scoring='recall',
                        n_jobs=-1,
                        cv=10)
xgb_grid.fit(X_train, y_train)
```
XGBoostError: b'[14:33:06] ..\\xgboost\\src\\objective\\./regression_loss.h:62: Check failed: base_score > 0.0f && base_score < 1.0f base_score must be in (0,1) for logistic loss'
Are you using the latest version?
yep, 0.82
whoa, 0.90 is out already?!
Can you post some toy data so that I can reproduce the problem you're experiencing?
I just updated to 0.90. I will rerun the code and post toy data here if the same issue persists.
@hcho3 still persists. I will push some toy data.
XGBoostError: [15:05:08] c:\jenkins\workspace\xgboost-win64_release_0.90\src\objective\./regression_loss.h:62: Check failed: base_score > 0.0f && base_score < 1.0f: base_score must be in (0,1) for logistic loss
@PyDataBlog Are you using English locale?
Yes. I do have a Danish locale installed, but I don't use it; English is the default.
Any updates?
@hcho3 it still didn't work. It's the old bug resurfacing again.
Can you post your data and script?
@hcho3
features.zip
binary_target.zip
Here are the compressed NumPy arrays for the features and target. After unzipping, the arrays should be loaded with joblib.
```
import joblib
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV

XX_scaled = joblib.load('features.joblib')
y = joblib.load('binary_target.joblib')

xgb = XGBClassifier()
parameters_to_be_searched = {
    'base_score': [0.01, 1],
    'learning_rate': [0.01, 1],
    'n_estimators': [1, 500],
}
xgb_grid = GridSearchCV(estimator=xgb,
                        param_grid=parameters_to_be_searched,
                        scoring='accuracy',
                        cv=10,
                        n_jobs=-1)
xgb_grid.fit(XX_scaled, y)
xgb_grid.best_score_
xgb_grid.best_estimator_
```
I get this error:
XGBoostError: [11:59:38] c:\jenkins\workspace\xgboost-win64_release_0.90\src\objective\./regression_loss.h:62: Check failed: base_score > 0.0f && base_score < 1.0f: base_score must be in (0,1) for logistic loss
@hcho3 any progress over this issue?
@PyDataBlog Sorry for the delay, I recently had a 2-week vacation and somehow this issue slipped through.
I just made a clean installation of Windows 10 inside VirtualBox (+ Miniconda) and then ran your example script. Unfortunately, I was not able to reproduce the error. Let me try again with a different system locale.
I tried the script again after changing the system locale to Danish, and still I do not see the error.
Can you post the version of scikit-learn you are using? You can get the version number by running
pip show scikit-learn
Mine shows
Name: scikit-learn
Version: 0.21.2
Summary: A set of python modules for machine learning and data mining
Home-page: http://scikit-learn.org
Author: None
Author-email: None
License: new BSD
Location: c:\users\hcho3\miniconda3\lib\site-packages
Requires: joblib, scipy, numpy
Required-by:
@PyDataBlog So actually I did get the error. The fix seems to be changing the line
'base_score': [0.01, 1],
to
'base_score': [0.01, 0.99],
since the requirement is that the base score be strictly less than 1.0.
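For anyone hitting this later, a minimal sketch of the corrected grid (parameter names taken from the script above; the assertion is just an illustrative pre-flight check, not part of any library API):

```python
# Corrected grid: logistic loss requires base_score strictly inside (0, 1),
# so the upper value 1 is replaced with 0.99.
parameters_to_be_searched = {
    'base_score': [0.01, 0.99],
    'learning_rate': [0.01, 1],
    'n_estimators': [1, 500],
}

# Pre-flight sanity check of the open-interval constraint before
# handing the grid to GridSearchCV:
assert all(0.0 < b < 1.0 for b in parameters_to_be_searched['base_score'])
```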
@hcho3 thanks for the fix. Glad to have you back.