Optuna: LightGBM - recommendations on hyperparameter tuning

Created on 29 Mar 2020 · 4 Comments · Source: optuna/optuna

Hi guys, I followed all of your examples for tuning LightGBM. I was hoping that some of you could share or reference some best practices and answer my questions below:

  • how many trials should the experiment run? Is 100 typically sufficient to find a set of parameters that is 'good enough'?

  • are the default learning rate and a hundred boosting rounds 'good' for finding the best hyperparameters (assuming time is not really a huge constraint)? I'm wondering if it should run with a slightly larger learning rate to speed it up, and perhaps with more boosting rounds.

  • should I limit the search space for some of these parameters to help Optuna focus on what matters most?

  • is the MedianPruner the most appropriate pruner in this case? What value of n_warmup_steps should I choose?

  • https://github.com/optuna/optuna/blob/master/examples/lightgbm_tuner_simple.py - instead of running a study, I also came across this example of tuning a LightGBM model. Is there some sort of ongoing hyperparameter optimization happening "on the fly"? I'm not quite sure how best_params gets updated.

I would really appreciate advice from some more seasoned Optuna users!

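My current implementation looks like this (ignore the task-specific parameters, such as 'objective'):

from math import sqrt

import lightgbm as lgb
import optuna
import sklearn.metrics

# train_x, train_y, test_x, test_y, and feat_cat are assumed to be defined elsewhere.

def objective(trial):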

    dtrain = lgb.Dataset(train_x, label = train_y, categorical_feature = feat_cat, free_raw_data = False)
    dtest  = lgb.Dataset(test_x, label = test_y, categorical_feature = feat_cat, free_raw_data = False)

    param = {
        'objective': 'poisson',
        'metric': 'rmse',
        'verbosity': -1,
        'boosting_type': 'gbdt',
        'force_row_wise': True,
        'max_depth': -1,

        'max_bin': trial.suggest_int('max_bin', 2, 512),  # LightGBM requires max_bin > 1
        'num_leaves': trial.suggest_int('num_leaves', 2, 512),

        'lambda_l1': trial.suggest_loguniform('lambda_l1', 1e-8, 10.0),
        'lambda_l2': trial.suggest_loguniform('lambda_l2', 1e-8, 10.0),

        'feature_fraction': trial.suggest_uniform('feature_fraction', 0.4, 1.0),
        'bagging_fraction': trial.suggest_uniform('bagging_fraction', 0.4, 1.0),
        'bagging_freq': trial.suggest_int('bagging_freq', 1, 7),

        # min_data_in_leaf is an alias of min_child_samples, and sub_feature / sub_row are
        # aliases of feature_fraction / bagging_fraction, so each is tuned only once.
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 100)
    }

    # Add a callback for pruning
    pruning_callback = optuna.integration.LightGBMPruningCallback(trial, 'rmse')

    gbm = lgb.train(
        param, 
        dtrain, 
        verbose_eval = 20,
        valid_sets = [dtest], 
        callbacks = [pruning_callback], 
        categorical_feature = feat_cat
        )

    preds = gbm.predict(test_x)
    # The study minimizes RMSE on the validation set.
    rmse = sqrt(sklearn.metrics.mean_squared_error(test_y, preds))

    return rmse

if __name__ == "__main__":
    study = optuna.create_study(direction = 'minimize', pruner = optuna.pruners.MedianPruner(n_warmup_steps = 10))
    study.optimize(objective, n_trials = 100)

    print("Number of finished trials: {}".format(len(study.trials)))

    print("Best trial:")
    trial = study.best_trial

    print("  Value: {}".format(trial.value))

    print("  Params: ")
    for key, value in trial.params.items():
        print("    {}: {}".format(key, value))

Labels: question, stale
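
On the MedianPruner question: a simplified sketch of the median rule in plain Python, not taken from Optuna's source, may make the role of n_warmup_steps clearer. The function name should_prune, its arguments, and the toy RMSE curves are all hypothetical. The sketch assumes lower values are better and that a trial is pruned when its best value so far is worse than the median of earlier trials at the same step.

```python
from statistics import median


def should_prune(current_values, finished_curves, step,
                 n_startup_trials=5, n_warmup_steps=10):
    """Simplified median rule (hypothetical helper, not Optuna's code)."""
    # Do not prune until enough earlier trials exist to form a median.
    if len(finished_curves) < n_startup_trials:
        return False
    # Do not prune during the warmup steps of the current trial.
    if step < n_warmup_steps:
        return False
    # Values that earlier trials reported at this same step.
    peers = [curve[step] for curve in finished_curves if len(curve) > step]
    if not peers:
        return False
    # Compare the best (lowest) value seen so far with the peers' median.
    best_so_far = min(current_values[: step + 1])
    return best_so_far > median(peers)


# Toy validation-RMSE curves, 20 boosting rounds each (made-up numbers).
finished = [[0.5] * 20, [0.6] * 20, [0.7] * 20, [0.8] * 20, [0.9] * 20]
slow_trial = [0.95] * 20   # worse than the median of 0.7
good_trial = [0.65] * 20   # better than the median of 0.7

print(should_prune(slow_trial, finished, step=5))
print(should_prune(slow_trial, finished, step=10))
print(should_prune(good_trial, finished, step=10))
```

Under these assumptions, a larger n_warmup_steps gives slow-starting trials more boosting rounds before they can be stopped, at the cost of spending more time on trials that would have been pruned anyway.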

All 4 comments

Have you read this blog post? It might help address some of your points: https://medium.com/optuna/lightgbm-tuner-new-optuna-integration-for-hyperparameter-optimization-8b7095e99258

Thanks @hvy, that's definitely an awesome reference I wasn't aware of!
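
On the question of how best_params gets updated in lightgbm_tuner_simple.py: the linked post describes the tuner as stepwise, meaning it tunes one group of hyperparameters at a time and carries the best values forward. A toy sketch of that idea in plain Python follows. It does not call Optuna or LightGBM. The names stepwise_tune and toy_score, the candidate lists, and the scoring formula are hypothetical stand-ins, and the real tuner presumably runs an Optuna search at each step rather than walking a fixed list.

```python
def stepwise_tune(score, base_params, steps):
    """Tune one parameter at a time, keeping the best value found so far."""
    best_params = dict(base_params)
    best_score = score(best_params)
    for name, candidates in steps:
        for value in candidates:
            # Change only the parameter of the current step.
            trial_params = {**best_params, name: value}
            trial_score = score(trial_params)
            if trial_score < best_score:
                # best_params is updated as soon as a better value appears.
                best_score, best_params = trial_score, trial_params
    return best_params, best_score


def toy_score(params):
    """Made-up stand-in for a validation loss; lower is better."""
    return ((params["feature_fraction"] - 0.8) ** 2
            + (params["num_leaves"] - 31) ** 2 / 1000)


base = {"feature_fraction": 1.0, "num_leaves": 127}
steps = [
    ("feature_fraction", [0.4, 0.6, 0.8, 1.0]),
    ("num_leaves", [15, 31, 63, 127]),
]

best_params, best_score = stepwise_tune(toy_score, base, steps)
print(best_params, best_score)
```

In this sketch, best_params changes during the run because every later step starts from the values that won the earlier steps. That differs from the single study in the original post, where all parameters are sampled together in each trial.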

This issue has not seen any recent activity.

Let me close this issue as fixed. Please feel free to reopen as needed.
