I obtained 82% binary classification accuracy on the test set with a fixed set of parameters. I then ran a grid search to increase the accuracy, but the resulting accuracy was lower.
I would like to ask what a sensible default hyperparameter search space would be for regular (balanced) data, one that a newcomer could use for an initial pass on a binary classification problem.
I would be glad if you could suggest a comprehensive parameter set, ideally one whose grid search finishes in less than a day on a MacBook Pro.
Check here: https://github.com/Microsoft/LightGBM/issues/695
It also applies to balanced datasets if you remove `scale_pos_weight`.
@Laurae2 Thanks a lot! I will try that. I was just about to reply in that issue to ask exactly this.
@Laurae2 Can you give smaller ranges or step sizes for `num_leaves` and `max_depth`? The ones you've given are too large; tuning them without a large step size takes too long on a regular computer. For the parameters you haven't recommended tuning, can you suggest values if they differ from LightGBM's defaults? Also, what is your suggestion for fixed choices such as the boosting type: should I just try all the options, or can you say that e.g. `dart` would be preferred over the others?
@akaniklaus Never use grid search; use random search or Bayesian optimization.
Use GBDT only, unless you have a special need for DART or GOSS.
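For readers unfamiliar with the difference: random search samples configurations independently instead of enumerating a grid, so it covers wide ranges with far fewer trials. A minimal, self-contained sketch of the idea (the parameter ranges and the stand-in scoring function are illustrative only, not recommendations from this thread; in practice the scorer would be cross-validated LightGBM training):

```python
import random

# Hypothetical search space; ranges are illustrative, not recommendations.
param_space = {
    "num_leaves": lambda: random.randint(15, 255),
    "learning_rate": lambda: 10 ** random.uniform(-3, -1),
    "feature_fraction": lambda: random.uniform(0.5, 1.0),
    "min_data_in_leaf": lambda: random.randint(5, 50),
}

def sample_params():
    """Draw one random configuration from the search space."""
    return {name: draw() for name, draw in param_space.items()}

def random_search(score_fn, n_trials=50, seed=0):
    """Evaluate n_trials random configurations; return (best_score, best_params)."""
    random.seed(seed)
    best = None
    for _ in range(n_trials):
        params = sample_params()
        score = score_fn(params)  # e.g. mean CV accuracy of a LightGBM model
        if best is None or score > best[0]:
            best = (score, params)
    return best

# Stand-in scorer so the sketch runs on its own; replace with real
# cross-validation of a LightGBM model trained with `params`.
def dummy_score(params):
    return -abs(params["learning_rate"] - 0.05) - abs(params["num_leaves"] - 63) / 1000

best_score, best_params = random_search(dummy_score, n_trials=200)
```

Because each trial is independent, this also parallelizes trivially and can be stopped at any time, which is why it is usually preferred over grid search for a first pass.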
@Laurae2 Thanks! I switched to random search, since I don't know how to use Bayesian optimization. Do you have any recommendations for `feature_fraction`? (I have 400 features, and some of them may be noisy or similar to each other.) Also, what about `lambda_l1`, `lambda_l2`, and `min_gain_to_split`? Lastly, should I set `min_data_in_leaf` to one and `min_sum_hessian_in_leaf` to zero, given that my dataset is quite small (2000 samples)? I also haven't figured out how to enable early stopping: should I use `early_stopping_rounds`, and what value should it take? P.S. I actually get better validation accuracy with `feature_fraction` and `bagging_fraction` than with `subsample` and `colsample_bytree`, probably because I am using a bag of features.
@akaniklaus you can try: