LightGBM: Not using all available features

Created on 4 Jun 2018 · 2 Comments · Source: microsoft/LightGBM

Environment info

Operating System: Linux Mint Serena 18.1
CPU: AMD A10-7300, x86_64
Python version: 3.5.2
lightgbm.__version__: '2.1.1'

Issue:

LightGBM is not using all available features

print(train.shape, test.shape, y.shape)
(307493, 439) (48744, 439) (307493,)

As shown above, the training data has 439 features, but LightGBM uses only 428 of them according to the log:

[LightGBM] [Info] Number of positive: 24823, number of negative: 282670
[LightGBM] [Info] Total Bins 57256
[LightGBM] [Info] Number of data: 307493, number of used features: 428

Params and relevant code

cat_feature_names = ['A', 'B'... around 40 cat features]
params = {
    'task': 'train',
    'num_leaves':32,
    'min_data_in_leaf': 420,
    'application': 'binary',
    'boosting': 'gbdt',
    'metric': 'auc',
    'learning_rate': 0.01,
    'min_child_weight': 18,
    'lambda_l1': 1.5,
    'lambda_l2': 1,
    'num_threads': 3,
    }
dataset = lgb.Dataset(train, y)
model = lgb.train(params, dataset, verbose_eval=1000, categorical_feature=cat_feature_names )

feature_fraction defaults to 1, so why is LightGBM not using all the features?

All 2 comments

The number of features used seems to be inversely related to the size of min_data_in_leaf.

If I reduce it all the way down to 1, it uses 438 features (one fewer than the actual count). With anything greater than 1, I lose more features. But it never uses all of them.

@quakig LightGBM automatically disables features that cannot be split, such as a feature whose values are almost all zeros (or all the same). The min_data_in_leaf parameter controls this.
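To make the reply above concrete, here is a minimal pure-Python sketch of the idea: a feature is only useful if some threshold leaves at least min_data_in_leaf rows on each side. This is a simplified illustration, not LightGBM's actual implementation (which works on binned values and may differ in detail). The function name `is_splittable` and the toy columns are hypothetical.

```python
from collections import Counter


def is_splittable(values, min_data_in_leaf):
    """Return True if some threshold leaves >= min_data_in_leaf rows on each side.

    Simplified stand-in for the kind of check described in the reply;
    NOT LightGBM's real code.
    """
    n = len(values)
    counts = Counter(values)
    left = 0
    # Walk the distinct values in sorted order; a threshold can sit
    # between any two neighbouring distinct values.
    for value in sorted(counts)[:-1]:
        left += counts[value]
        if left >= min_data_in_leaf and n - left >= min_data_in_leaf:
            return True
    return False


# Hypothetical toy columns with 1000 rows each.
features = {
    "constant": [0] * 1000,
    "almost_all_zero": [0] * 995 + [1] * 5,
    "skewed": [0] * 700 + [1] * 300,
    "balanced": [0] * 500 + [1] * 500,
}

for min_data in (1, 20, 420):
    usable = [name for name, col in features.items() if is_splittable(col, min_data)]
    print(min_data, usable)
# 1 ['almost_all_zero', 'skewed', 'balanced']
# 20 ['skewed', 'balanced']
# 420 ['balanced']
```

In this toy setup the constant column is dropped even at min_data_in_leaf=1, and more columns drop out as the threshold grows. That is consistent with the behaviour reported in the thread (438 of 439 features at 1, 428 at 420), assuming the missing features are constant or highly imbalanced.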
