How do I load a saved LGBMClassifier model? I use the following code to train and save it:
`lgmodel = lgb.LGBMClassifier(
boosting_type='gbdt',
objective='multiclass',
learning_rate=0.01,
colsample_bytree=0.9,
subsample=0.8,
random_state=1,
n_estimators=100,
num_leaves=31)
print('begin to predict data')
lgmodel.fit(X_train, y_train)
pred_period = lgmodel.predict(X_test)`
lgbmodel.booster_.save_model('mode.txt')
I used the following code to load the model, but it raises an error:
bst = lgb.Booster(model_file='model.txt')
I don't know why you closed my issue, @guolinke. It doesn't work for me.
save model to file. it seems you miss . .
lgbmodel.booster_.save_model('mode.txt')
load from model:
bst = lgb.Booster(model_file='mode.txt')
Isn't that clear enough?
But there is an issue. I used lgb.Booster(model_file='mode.txt') to load the model and predict:
lgmodel = lgb.Booster(model_file='mode.txt')
pred_period = lgmodel.predict(X_test)
The shape of pred_period is wrong. I expect the shape (x,), where x is the number of rows in X_test, but the result has shape (x, y), where y is 14 in my case.
Hi @guolinke, could you reopen this issue?
I think it is a bug in LightGBM.
@tianke0711
I see, for the sklearn model save/load, you can use joblib.
example:
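# Note: sklearn.externals.joblib was removed in scikit-learn 0.23; on newer versions use `import joblib` instead.
from sklearn.externals import joblib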
# save model
joblib.dump(lgbmodel, 'lgb.pkl')
# load model
gbm_pickle = joblib.load('lgb.pkl')
Hi @guolinke, thanks for your comment. But I want to load a model that was not created by me. If I had created it, I could use your method to save and load the model.
@tianke0711 I think you are using the sklearn interface with multi-class classification.
In this case it is not possible to use just the raw text model to predict classes, because the sklearn wrapper applies a transform to the class labels (https://github.com/Microsoft/LightGBM/blob/master/python-package/lightgbm/sklearn.py#L640-L641), and that transform is not saved in the text model file.
This code will give the probability of every class:
lgmodel = lgb.Booster(model_file='mode.txt')
pred_period = lgmodel.predict(X_test)
and you can use
class_index = np.argmax(pred_period, axis=1)
pred = self._le.inverse_transform(class_index)
to get the class prediction.
So without self._le, you cannot get the class mapping.
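One way to keep the class mapping when only the text model is shipped is to store the labels in a side file. The following is a minimal sketch, not part of LightGBM: the labels, the probability matrix, and the file name `classes.json` are made up for illustration. It assumes the labels were taken from the fitted classifier's `classes_` attribute before saving, and that the columns of the `Booster.predict` output follow that order.

```python
import json
import os
import tempfile

import numpy as np

# Hypothetical labels; in practice this could be lgmodel.classes_.tolist()
# taken from the fitted LGBMClassifier before the text model is saved.
classes = ["cat", "dog", "fish"]

# Save the labels next to the text model (file name is an assumption).
classes_path = os.path.join(tempfile.mkdtemp(), "classes.json")
with open(classes_path, "w") as f:
    json.dump(classes, f)

# Stand-in for bst.predict(X_test): one row per sample, one column per class.
pred_period = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
])

# Later, after loading the Booster, restore the labels from the side file.
with open(classes_path) as f:
    loaded_classes = np.array(json.load(f))

class_index = np.argmax(pred_period, axis=1)  # shape (n_samples,)
pred = loaded_classes[class_index]            # original class labels
print(pred)
```

This plays the role of `self._le.inverse_transform` for a model loaded as a plain Booster, at the cost of keeping two files in sync.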
OK, I got it, thanks @guolinke! It would be best if LightGBM could improve this.
@tianke0711
I don't think it is a real issue.
Since you are using the sklearn interface, it is better to use sklearn's solution to save/load the model.
Booster.save/load is the "low level" LightGBM interface; it is not responsible for the upper level's implementation.
import lightgbm as lgb
gbm = lgb.train(params,
lgb_train,
num_boost_round=10,
valid_sets=lgb_train, # eval training data
feature_name=feature_name,
categorical_feature=[21])
gbm.save_model('model.txt')
bst = lgb.Booster(model_file='model.txt')
After I use booster_ to save model:
clf.booster_.save_model('../model/lightGBM_v1_1.txt')
I load model by:
bst = lgb.Booster(model_file='../model/lightGBM_v1_1.txt');
but bst has no predict_proba() method, which is what I really need. It took me 30 hours to train this model; disappointed...
@LifeRiver2017 you can get the raw score via the predict function and convert it to a probability yourself.
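A minimal sketch of such a conversion, assuming a multi-class model trained with the softmax-based `multiclass` objective and raw scores of shape (n_samples, n_classes), such as those returned by `bst.predict(X_test, raw_score=True)`. The `raw` array below is a made-up stand-in, and the `softmax` helper is illustrative rather than a LightGBM function. For a binary objective the analogous transform would be a sigmoid.

```python
import numpy as np


def softmax(raw_scores):
    """Convert per-class raw scores into probabilities, row by row."""
    # Subtract the row maximum first for numerical stability.
    shifted = raw_scores - raw_scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Stand-in for raw scores from a loaded Booster: 2 samples, 3 classes.
raw = np.array([
    [2.0, 1.0, 0.1],
    [0.0, 0.0, 0.0],
])

proba = softmax(raw)
print(proba)
```

Each row of `proba` sums to 1, and the largest raw score in a row maps to the largest probability.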
I agree with @hongbo77. Why isn't predict_proba there? Why do I have to use _Booster to save, which then results in loading my LGBMClassifier object as a Booster object? That was fine until I realized predict_proba isn't there.
I was able to solve my issue using @guolinke 's code above starting with "from sklearn.externals import joblib". Basically, don't use LightGBM's load / save functionality.
@guolinke @JonHolman try to use predict() instead of predict_proba()
So the lgbmodel.booster_.save_model method is still broken for multiclass classification? The only workaround is to use joblib? Shouldn't this issue be reopened?
Thanks @ShanLu1984 , @hongbo77 booster.predict() actually will return the probabilities.
@alexander-rakhlin I don't think it is broken. It can save/load a multi-class model, but the loaded Booster lacks the sklearn predict function, which returns the predicted class (lgb.booster.predict returns the class probabilities).
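For those who want a `predict_proba`-like result from a loaded Booster, here is a minimal sketch. It assumes that `Booster.predict` returns a 1-D array of positive-class probabilities for a binary model and a 2-D (n_samples, n_classes) array for a multi-class model. The helper name `to_proba_matrix` and the arrays are hypothetical, used only for illustration.

```python
import numpy as np


def to_proba_matrix(booster_output):
    """Shape Booster.predict output like sklearn's predict_proba."""
    p = np.asarray(booster_output, dtype=float)
    if p.ndim == 1:
        # Binary case: build [P(class 0), P(class 1)] columns.
        return np.column_stack((1.0 - p, p))
    # Multi-class case: already one column per class.
    return p


# Stand-ins for bst.predict(X_test) on a binary and a multi-class model.
binary_out = np.array([0.9, 0.2, 0.5])
multi_out = np.array([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]])

binary_proba = to_proba_matrix(binary_out)
multi_proba = to_proba_matrix(multi_out)
print(binary_proba.shape, multi_proba.shape)
```

Mapping the columns back to the original class labels still needs the label information that the text model does not store.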
@Rpygamer @all
For prediction, I'd like to get the class names from the prediction result.
How do I get the name of each class after loading the model?