LightGBM: Issue when boost_from_average=True (default value) and is_unbalance=True

Created on 17 Dec 2018 · 7 comments · Source: microsoft/LightGBM

When boost_from_average=True (it's the default value) and is_unbalance=True, I get a low eval loss at iteration 1, then loss starts increasing, then it starts decreasing again. The problem is that, when I use early stopping, it just selects the iteration 1 because it has the lowest loss value on the eval set.

If I set boost_from_average=False, I don't get that low eval loss at the beginning, so I don't have this issue.
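Below is a minimal sketch of the kind of setup being described (not the reporter's actual code): a synthetic imbalanced binary dataset, is_unbalance=True, the default boost_from_average=True, and early stopping on a validation set. On recent LightGBM versions early stopping is passed as a callback; older versions take early_stopping_rounds= in lgb.train directly.

```python
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic imbalanced binary problem (~5% positives), purely illustrative.
X, y = make_classification(n_samples=50_000, weights=[0.95, 0.05], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

train_set = lgb.Dataset(X_tr, label=y_tr)
valid_set = lgb.Dataset(X_val, label=y_val, reference=train_set)

params = {
    "objective": "binary",
    "metric": "binary_logloss",
    "learning_rate": 0.01,
    "is_unbalance": True,
    "boost_from_average": True,  # the default; setting False avoids the issue
}

# With early stopping, iteration 1 can "win" because its eval loss is lower
# than anything reached later in training.
booster = lgb.train(
    params,
    train_set,
    num_boost_round=1000,
    valid_sets=[valid_set],
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)
print("best iteration:", booster.best_iteration)
```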

All 7 comments

Yeah, I almost always set boost_from_average=False for classification. It seems to cause problems when using unbalanced data.

I'm assuming your data is unbalanced and that 'is_unbalance=True' is weighting the minority class too heavily. Try using scale_pos_weight instead to set the weight more appropriately.
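One common starting point for scale_pos_weight is the negative/positive count ratio, which is roughly what is_unbalance applies automatically; choosing the value yourself lets you use something less aggressive. A sketch, with y_train standing in for your binary training labels:

```python
import numpy as np

# Hypothetical labels: 950 negatives, 50 positives.
y_train = np.array([0] * 950 + [1] * 50)

# Negative/positive ratio; values between 1.0 and this ratio weight the
# positive class less aggressively than is_unbalance would.
ratio = (y_train == 0).sum() / (y_train == 1).sum()

params = {
    "objective": "binary",
    "metric": "binary_logloss",
    "scale_pos_weight": ratio,
}
```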

Also, for your error, you can try lowering the learning rate, increasing 'min_child_weight' or 'min_split_gain', or lowering 'feature_fraction' and 'bagging_fraction'. Try DART as well. There's a range of parameters to tune when the first iteration is the best.
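A sketch of what those suggestions look like as a parameter dict; the values are illustrative, not recommendations:

```python
params = {
    "objective": "binary",
    "metric": "binary_logloss",
    "boosting": "dart",          # try DART instead of the default gbdt
    "learning_rate": 0.005,      # lower learning rate
    "min_child_weight": 10,      # alias of min_sum_hessian_in_leaf
    "min_split_gain": 0.1,       # alias of min_gain_to_split
    "feature_fraction": 0.7,
    "bagging_fraction": 0.7,
    "bagging_freq": 1,           # bagging_fraction only applies when this is > 0
}
```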

The weird thing is that, when boost_from_average=True and is_unbalance=True, the eval loss value at iteration 1 is much lower than what I can reach by training my model.

Let's say that with boost_from_average=False the loss value starts at 0.65 and, after many iterations, it is about 0.35.

With boost_from_average=True it starts at 0.20, then it jumps to 0.60, then it starts decreasing again, but it can't reach 0.20 anymore even after many thousands of iterations.

I don't have this issue if I use class_weight='balanced' in the sklearn wrapper instead of is_unbalance=True.
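For reference, a sketch of the sklearn-wrapper alternative mentioned here: class_weight='balanced' derives per-class weights from class frequencies instead of using is_unbalance. X_train, y_train, X_val, y_val are placeholder names.

```python
import lightgbm as lgb

clf = lgb.LGBMClassifier(
    objective="binary",
    class_weight="balanced",     # sklearn-style class weighting
    boost_from_average=True,     # keep the default
    learning_rate=0.01,
    n_estimators=1000,
)
# clf.fit(X_train, y_train, eval_set=[(X_val, y_val)])
```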

Did you try a lower learning rate, e.g. 0.01 or 0.001?

You should post the code if possible

ping @guolinke @zkurtz

is_unbalance puts greater weight on the minority class (weights defined here). It's intuitive to me that this could be necessary when boost_from_average = False, but less clear how it can help if boost_from_average = True. The combination of boost_from_average = True and is_unbalance = True may be redundant in some sense.

It actually makes sense that you would see increasing loss, assuming you're minimizing the negative log likelihood. boost_from_average = True means that the very first iteration will be a good default model that matches the base rate. As soon as you start boosting with is_unbalance = True, you're putting an unfair amount of weight on the minority class, such that a split will occur to reduce error on the minority class even if it disproportionately increases error on the majority class. Subsequent iterations decrease the loss if accuracy improvements are sufficient. In the extreme case, loss approaches zero as the classifier achieves perfect separation, which can happen even with uneven weights.

Bottom line: I'd like to see what happens with boost_from_average = True and is_unbalance = False.
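A sketch of that comparison: train on the same data with each combination of the two flags and inspect the eval-loss curves. train_set and valid_set are placeholders for lgb.Dataset objects (e.g. from the snippet near the top of this thread).

```python
import lightgbm as lgb

results = {}
for boost_from_average in (True, False):
    for is_unbalance in (True, False):
        params = {
            "objective": "binary",
            "metric": "binary_logloss",
            "learning_rate": 0.01,
            "boost_from_average": boost_from_average,
            "is_unbalance": is_unbalance,
        }
        evals = {}
        lgb.train(
            params,
            train_set,
            num_boost_round=500,
            valid_sets=[valid_set],
            valid_names=["valid"],
            callbacks=[lgb.record_evaluation(evals)],
        )
        # Per-iteration eval loss for this flag combination.
        results[(boost_from_average, is_unbalance)] = evals["valid"]["binary_logloss"]
```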

Learning rate is already pretty low at 0.01

At the moment I'm using is_unbalance=True with boost_from_average=False.

Please note that if I use class_weight='balanced' from the sklearn wrapper instead of is_unbalance=True, I don't have the same behavior with boost_from_average=True.

I just want to add that the class_weight param is meant for multiclass problems, while the is_unbalance and scale_pos_weight params are for binary ones.
