LightGBM: [LightGBM] [Warning] No further splits with positive gain, best gain: -inf

Created on 12 Nov 2020 · 2 comments · Source: microsoft/LightGBM

For a dataset with 24 columns and 1000 records, running AutoML with the following parameter set

    "LGBM_Estimators" : [100,200,300,400,500,600,700,800,900,1000], 
    "LGBM_Learning_Rate" : [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0],
    "LGBM_ColByTree" : [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0],
    "LGBM_SubSample" : [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0],
    "LGBM_MinCWeight" : [0.001,0.002,0.003,0.004,0.005,0.006,0.007,0.008,0.009],
    "LGBM_NumLeaves" : [8,16,24,32,36,40,46,52,58,64,68,72]

throws many repetitions of the warning below:

[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf

May I know what mistake I am making?
Thanks
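
For reference, a minimal sketch (not the poster's actual AutoML code) of how a grid like the one above is typically wired to LightGBM's scikit-learn interface. The LGBM_* keys are assumed here to map to the LGBMClassifier parameters shown; the data is a synthetic stand-in for the 1000-record, 24-column dataset:

    import numpy as np
    from lightgbm import LGBMClassifier
    from sklearn.model_selection import RandomizedSearchCV

    # Assumed mapping of the AutoML's LGBM_* keys to LGBMClassifier parameters.
    param_grid = {
        "n_estimators":     [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000],
        "learning_rate":    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
        "colsample_bytree": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
        "subsample":        [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
        "min_child_weight": [0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009],
        "num_leaves":       [8, 16, 24, 32, 36, 40, 46, 52, 58, 64, 68, 72],
    }

    # Synthetic stand-in for the dataset described above.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 24))
    y = rng.integers(0, 2, size=1000)

    search = RandomizedSearchCV(LGBMClassifier(), param_grid, n_iter=20, cv=3, random_state=0)
    # Candidates with large num_leaves on this small dataset will emit the
    # "No further splits with positive gain" warning discussed below.
    search.fit(X, y)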

Label: question


All 2 comments

Hi @hanzigs , thanks for using LightGBM!

This doesn't mean that you've made any "mistakes", necessarily. That warning means that the boosting process has effectively ended early, because you've overfit to the training data.

The default min_data_in_leaf for LightGBM is 20 (https://lightgbm.readthedocs.io/en/latest/Parameters.html#min_data_in_leaf). That means that any split that would create a leaf node with fewer than 20 records in it is ignored. Setting num_leaves to values as high as 72 will grow very deep trees, and I think it's too aggressive for a dataset with only 1000 records.
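
As a hedged sketch of how these two parameters interact (in the scikit-learn API, min_data_in_leaf is exposed under its alias min_child_samples; the values here are illustrative):

    from lightgbm import LGBMClassifier

    model = LGBMClassifier(
        num_leaves=31,         # keep modest for a ~1000-record dataset
        min_child_samples=20,  # alias of min_data_in_leaf; splits creating smaller leaves are rejected
    )

With num_leaves small and min_child_samples at its default, fewer candidate splits hit the 20-record floor, so the warning fires less often.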

If you want to avoid this in the future, you can enable early stopping by passing early_stopping_rounds and a validation set: https://lightgbm.readthedocs.io/en/latest/Python-Intro.html?highlight=early%20stopping#early-stopping. That will stop the training process after a few consecutive iterations with no gain.
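
A minimal sketch of the early-stopping pattern from the linked docs, assuming X_train/y_train and X_valid/y_valid are already split out:

    import lightgbm as lgb

    train_set = lgb.Dataset(X_train, label=y_train)
    valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)

    booster = lgb.train(
        {"objective": "binary", "num_leaves": 31},
        train_set,
        num_boost_round=500,
        valid_sets=[valid_set],
        # Stops training after 10 consecutive rounds with no improvement on the
        # validation set. In LightGBM 3.x this could also be passed directly as
        # early_stopping_rounds=10.
        callbacks=[lgb.early_stopping(stopping_rounds=10)],
    )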

I also recommend not using num_leaves values larger than 32 with such a small dataset.

Thanks @jameslamb
It's good now. For the AutoML I set the above as the default parameter set, so for testing I passed a small dataset and got the warnings.

