When users specify training and validation folds in the way the basic lightgbm.cv function accepts, this should (from what I understand) work. Instead, it fails with:
0%| | 0/7 [00:00<?, ?it/s]
feature_fraction, val_score: inf: 0%| | 0/7 [00:00<?, ?it/s][W 2020-08-24 15:41:09,973] Trial 0 failed because of the following error: ValueError('For early stopping, at least one dataset and eval metric is required for evaluation',)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/optuna/study.py", line 709, in _run_trial
result = func(trial)
File "/usr/local/lib/python3.6/dist-packages/optuna/integration/_lightgbm_tuner/optimize.py", line 302, in __call__
cv_results = lgb.cv(self.lgbm_params, self.train_set, **self.lgbm_kwargs)
File "/usr/local/lib/python3.6/dist-packages/lightgbm/engine.py", line 576, in cv
evaluation_result_list=res))
File "/usr/local/lib/python3.6/dist-packages/lightgbm/callback.py", line 221, in _callback
_init(env)
File "/usr/local/lib/python3.6/dist-packages/lightgbm/callback.py", line 191, in _init
raise ValueError('For early stopping, '
ValueError: For early stopping, at least one dataset and eval metric is required for evaluation
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-0ec8edbe946a> in <module>()
2 label = np.array( data['target'] ).flatten())
3 tuner = lgb.LightGBMTunerCV(params, dtrain, verbose_eval=100, early_stopping_rounds=100, folds=folds)
----> 4 tuner.run()
10 frames
/usr/local/lib/python3.6/dist-packages/lightgbm/callback.py in _init(env)
189 return
190 if not env.evaluation_result_list:
--> 191 raise ValueError('For early stopping, '
192 'at least one dataset and eval metric is required for evaluation')
193
ValueError: For early stopping, at least one dataset and eval metric is required for evaluation
As well as (on the second version without early stopping, which I think is an issue that's already reported in another issue?):
0%| | 0/7 [00:00<?, ?it/s]
feature_fraction, val_score: inf: 0%| | 0/7 [00:00<?, ?it/s][W 2020-08-24 15:42:03,262] Trial 0 failed because of the following error: KeyError('l1-mean',)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/optuna/study.py", line 709, in _run_trial
result = func(trial)
File "/usr/local/lib/python3.6/dist-packages/optuna/integration/_lightgbm_tuner/optimize.py", line 304, in __call__
val_scores = self._get_cv_scores(cv_results)
File "/usr/local/lib/python3.6/dist-packages/optuna/integration/_lightgbm_tuner/optimize.py", line 294, in _get_cv_scores
val_scores = cv_results["{}-mean".format(metric)]
KeyError: 'l1-mean'
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-21-942a30076787> in <module>()
1 tuner = lgb.LightGBMTunerCV(params, dtrain, verbose_eval=100, folds=folds)
----> 2 tuner.run()
8 frames
/usr/local/lib/python3.6/dist-packages/optuna/integration/_lightgbm_tuner/optimize.py in _get_cv_scores(self, cv_results)
292
293 metric = self._get_metric_for_objective()
--> 294 val_scores = cv_results["{}-mean".format(metric)]
295 return val_scores
296
KeyError: 'l1-mean'
!pip install lightgbm==2.3.1
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupKFold
import lightgbm as lgb
lgb.__version__
np.random.seed(123)
data = pd.DataFrame({'var1': np.random.normal(loc=0, scale=1, size=100),
                     'var2': np.random.normal(loc=0, scale=1, size=100),
                     'var3': np.random.normal(loc=0, scale=1, size=100),
                     'testfold': np.random.choice(a=np.repeat([x for x in range(5)], 20), size=100, replace=False)})
data['target'] = 7 + 0.1*data['var1'] + 1.0*data['var2'] + 5.0*data['var3'] - 2.0*data['var1']*data['var2'] + np.random.normal(loc=0, scale=0.5, size=100)
data.head()
params = {
    'objective': 'l1',
    'metric': 'l1',
    "verbosity": -1,
    "boosting_type": "gbdt",
    'seed': 1979
}
dtrain = lgb.Dataset(data=np.array(data[['var1', 'var2', 'var3']]),
                     label=np.array(data['target']).flatten())
folds = GroupKFold().split(np.array(data[['var1', 'var2', 'var3']]),
                           np.array(data['target']).flatten(),
                           np.array(data['testfold']).flatten())
lgb.cv(params, dtrain, folds=folds, verbose_eval=100) # This is how base lightgbm does this, and it works fine
!pip install optuna
import optuna.integration.lightgbm as lgb
dtrain = lgb.Dataset(data=np.array(data[['var1', 'var2', 'var3']]),
                     label=np.array(data['target']).flatten())
tuner = lgb.LightGBMTunerCV(params, dtrain, verbose_eval=100, early_stopping_rounds=100, folds=folds)
tuner.run()
tuner = lgb.LightGBMTunerCV(params, dtrain, verbose_eval=100, folds=folds)
tuner.run()
The same issue occurs in Kaggle kernels, but I thought it would be easier to share a simplified Colab example.
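One thing that may be worth ruling out in the reproduction above: GroupKFold().split(...) returns a generator, and a generator can only be iterated once. Because folds is first passed to the plain lgb.cv call, it may already be empty by the time LightGBMTunerCV receives it. Whether this is the actual cause of either error is an assumption, not something confirmed in this thread. The sketch below uses a hypothetical stand-in, toy_group_split, instead of scikit-learn, to show the behaviour and the list(...) workaround.

```python
from typing import Iterator, List, Tuple

Fold = Tuple[List[int], List[int]]


def toy_group_split(groups: List[int]) -> Iterator[Fold]:
    """Hypothetical stand-in for GroupKFold().split(): yields (train_idx, test_idx)."""
    for g in sorted(set(groups)):
        test_idx = [i for i, grp in enumerate(groups) if grp == g]
        train_idx = [i for i, grp in enumerate(groups) if grp != g]
        yield train_idx, test_idx


groups = [0, 0, 1, 1, 2, 2]

# A generator is consumed by its first full iteration.
folds_gen = toy_group_split(groups)
first_pass = list(folds_gen)   # what a first consumer of the folds would see
second_pass = list(folds_gen)  # what any later consumer would see

# Materializing the folds once makes them reusable across calls.
folds_list = list(toy_group_split(groups))
reuse_a = [test for _, test in folds_list]
reuse_b = [test for _, test in folds_list]

print(len(first_pass), len(second_pass))
print(reuse_a == reuse_b)
```

In the reproduction, the analogous change would be wrapping the split in list(...), that is folds = list(GroupKFold().split(...)), so that every call sees the full set of splits. This assumes the folds argument accepts any iterable of (train_idx, test_idx) tuples.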
Thank you for your bug report. I'm not aware of the first issue. I'll investigate it.
As well as (on the second version without early stopping, which I think is an issue that's already reported in another issue?):
I think it is the same issue as #1602. The cause is the lack of the metric mapping in LightGBMTunerCV and @thigm85 is working on it.
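As a rough illustration of what a metric mapping could look like, the sketch below normalizes a metric alias to the name used in the result keys before looking up the "<metric>-mean" entry. The alias table is an illustrative subset, and METRIC_ALIASES and get_cv_scores are hypothetical names; this is not the implementation in Optuna or the fix tracked in #1602.

```python
from typing import Dict, List

# Illustrative subset of LightGBM metric aliases mapped to the names used in
# result keys. This table is an assumption, not Optuna's actual mapping.
METRIC_ALIASES: Dict[str, str] = {
    "mae": "l1",
    "mean_absolute_error": "l1",
    "regression_l1": "l1",
    "mse": "l2",
    "mean_squared_error": "l2",
    "regression_l2": "l2",
    "root_mean_squared_error": "rmse",
    "l2_root": "rmse",
}


def get_cv_scores(cv_results: Dict[str, List[float]], metric: str) -> List[float]:
    """Hypothetical helper: look up '<metric>-mean' after normalizing aliases."""
    canonical = METRIC_ALIASES.get(metric, metric)
    key = "{}-mean".format(canonical)
    if key not in cv_results:
        # Report the available keys so an empty or mismatched result is obvious.
        raise KeyError(
            "{!r} not found; available keys: {}".format(key, sorted(cv_results))
        )
    return cv_results[key]


# Hand-written dictionary shaped like a cv result (the values are made up).
example_results = {"l1-mean": [1.5, 1.2], "l1-stdv": [0.1, 0.1]}
scores = get_cv_scores(example_results, "mae")
print(scores)
```

Listing the available keys in the error message also makes it easier to tell a naming mismatch apart from an empty result dictionary.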
Yes, you are right, #1602 was indeed the issue I had seen before (but could not find again).
This issue has not seen any recent activity.