LightGBM: Issue when boost_from_average=True (default value) and is_unbalance=True

Created on 17 Dec 2018 · 7 comments · Source: microsoft/LightGBM

When boost_from_average=True (it's the default value) and is_unbalance=True, I get a low eval loss at iteration 1, then loss starts increasing, then it starts decreasing again. The problem is that, when I use early stopping, it just selects the iteration 1 because it has the lowest loss value on the eval set.

If I set boost_from_average=False, I don't get that low eval loss at the beginning, so I don't have this issue.
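Below is a minimal sketch of the kind of setup being described (not the reporter's actual code): a synthetic imbalanced binary dataset, is_unbalance=True, the default boost_from_average=True, and early stopping on a validation set. On recent LightGBM versions early stopping is passed as a callback; older versions take early_stopping_rounds= in lgb.train directly.

```python
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic imbalanced binary problem (~5% positives), purely illustrative.
X, y = make_classification(n_samples=50_000, weights=[0.95, 0.05], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

train_set = lgb.Dataset(X_tr, label=y_tr)
valid_set = lgb.Dataset(X_val, label=y_val, reference=train_set)

params = {
    "objective": "binary",
    "metric": "binary_logloss",
    "learning_rate": 0.01,
    "is_unbalance": True,
    "boost_from_average": True,  # the default; setting False avoids the issue
}

# With early stopping, iteration 1 can "win" because its eval loss is lower
# than anything reached later in training.
booster = lgb.train(
    params,
    train_set,
    num_boost_round=1000,
    valid_sets=[valid_set],
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)
print("best iteration:", booster.best_iteration)
```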

All 7 comments

Yeah, I almost always set boost_from_average=False for classification. It seems to cause problems when using unbalanced data.

I'm assuming your data is unbalanced and that 'is_unbalance=True' is weighting the minority class too heavily. Try using scale_pos_weight instead to set the weight more appropriately.
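One common starting point for scale_pos_weight is the negative/positive count ratio, which is roughly what is_unbalance applies automatically; choosing the value yourself lets you use something less aggressive. A sketch, with y_train standing in for your binary training labels:

```python
import numpy as np

# Hypothetical labels: 950 negatives, 50 positives.
y_train = np.array([0] * 950 + [1] * 50)

# Negative/positive ratio; values between 1.0 and this ratio weight the
# positive class less aggressively than is_unbalance would.
ratio = (y_train == 0).sum() / (y_train == 1).sum()

params = {
    "objective": "binary",
    "metric": "binary_logloss",
    "scale_pos_weight": ratio,
}
```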

Also, for your error, you can try lowering the learning rate, increasing 'min_child_weight' or 'min_split_gain', or lowering 'feature_fraction' and 'bagging_fraction'. Try DART as well. There's a range of parameters to tune when the first iteration is the best.
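A sketch of what those suggestions look like as a parameter dict; the values are illustrative, not recommendations:

```python
params = {
    "objective": "binary",
    "metric": "binary_logloss",
    "boosting": "dart",          # try DART instead of the default gbdt
    "learning_rate": 0.005,      # lower learning rate
    "min_child_weight": 10,      # alias of min_sum_hessian_in_leaf
    "min_split_gain": 0.1,       # alias of min_gain_to_split
    "feature_fraction": 0.7,
    "bagging_fraction": 0.7,
    "bagging_freq": 1,           # bagging_fraction only applies when this is > 0
}
```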

The weird thing is that, when boost_from_average=True and is_unbalance=True, the eval loss value at iteration 1 is much lower than what I can reach by training my model.

Let's say that with boost_from_average=False the loss value starts at 0.65 and, after many iterations, it is about 0.35.

With boost_from_average=True it starts at 0.20, then it jumps to 0.60, then it starts decreasing again, but it can't reach 0.20 anymore even after many thousands of iterations.

I don't have this issue if I use class_weight='balanced' in the sklearn wrapper instead of is_unbalance=True.
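For reference, a sketch of the sklearn-wrapper alternative mentioned here: class_weight='balanced' derives per-class weights from class frequencies instead of using is_unbalance. X_train, y_train, X_val, y_val are placeholder names.

```python
import lightgbm as lgb

clf = lgb.LGBMClassifier(
    objective="binary",
    class_weight="balanced",     # sklearn-style class weighting
    boost_from_average=True,     # keep the default
    learning_rate=0.01,
    n_estimators=1000,
)
# clf.fit(X_train, y_train, eval_set=[(X_val, y_val)])
```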

Did you try a lower learning rate, e.g. 0.01 or 0.001?

You should post the code if possible

ping @guolinke @zkurtz

is_unbalance puts greater weight on the minority class (weights defined here). It's intuitive to me that this could be necessary when boost_from_average = False, but less clear how it can help if boost_from_average = True. The combination of boost_from_average = True and is_unbalance = True may be redundant in some sense.

It actually makes sense that you would see increasing loss, assuming you're minimizing the negative log likelihood. boost_from_average = True means that the very first iteration will be a good default model that matches the base rate. As soon as you start boosting with is_unbalance = True, you're putting an unfair amount of weight on the minority class, such that a split will occur to reduce error on the minority class even if it disproportionately increases error on the majority class. Subsequent iterations decrease the loss if accuracy improvements are sufficient. In the extreme case, loss approaches zero as the classifier achieves perfect separation, which can happen even with uneven weights.

Bottom line: I'd like to see what happens with boost_from_average = True and is_unbalance = False.
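A sketch of that comparison: train on the same data with each combination of the two flags and inspect the eval-loss curves. train_set and valid_set are placeholders for lgb.Dataset objects (e.g. from the snippet near the top of this thread).

```python
import lightgbm as lgb

results = {}
for boost_from_average in (True, False):
    for is_unbalance in (True, False):
        params = {
            "objective": "binary",
            "metric": "binary_logloss",
            "learning_rate": 0.01,
            "boost_from_average": boost_from_average,
            "is_unbalance": is_unbalance,
        }
        evals = {}
        lgb.train(
            params,
            train_set,
            num_boost_round=500,
            valid_sets=[valid_set],
            valid_names=["valid"],
            callbacks=[lgb.record_evaluation(evals)],
        )
        # Per-iteration eval loss for this flag combination.
        results[(boost_from_average, is_unbalance)] = evals["valid"]["binary_logloss"]
```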

Learning rate is already pretty low at 0.01

At the moment I'm using is_unbalance=True with boost_from_average=False.

Please note that if I use class_weight='balanced' from the sklearn wrapper instead of is_unbalance=True, I don't have the same behavior with boost_from_average=True.

I just want to add that the class_weight param is meant for multiclass problems, while the is_unbalance and scale_pos_weight params are for binary ones.
