Problem: CatBoostRegressor fails in sklearn's RFECV.
catboost version: 0.23
Operating System: macOS 10.15.3
CPU: Intel Core i5
GPU: N/A
When I try to use CatBoostRegressor with sklearn's RFECV, I get the error below.
Example:
from catboost import CatBoostRegressor
from sklearn.model_selection import KFold
from sklearn.feature_selection import RFECV

# X and y are the user's feature matrix and regression target (not defined here)
rfecv = RFECV(estimator=CatBoostRegressor(),
              cv=KFold(5),
              scoring='neg_mean_squared_error')
rfecv.fit(X, y)
gives an error:
~/.pyenv/versions/3.7.7/lib/python3.7/site-packages/sklearn/feature_selection/_rfe.py in _more_tags(self)
333
334 def _more_tags(self):
--> 335 estimator_tags = self.estimator._get_tags()
336 return {'poor_score': True,
337 'allow_nan': estimator_tags.get('allow_nan', True)}
AttributeError: 'CatBoostRegressor' object has no attribute '_get_tags'
The scikit-learn guide "Developing scikit-learn estimators" describes estimator tags as follows:
"These are annotations of estimators that allow programmatic inspection of their capabilities, such as sparse matrix support, supported output types and supported methods. The estimator tags are a dictionary returned by the method _get_tags()."
sklearn's RFECV calls _get_tags() on the wrapped estimator to check whether it allows NaN, and CatBoostRegressor does not define that method.
Can you add a _get_tags method to the CatBoostRegressor and CatBoostClassifier classes?
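To make the failure mode concrete, here is a minimal pure-Python sketch of the lookup shown in the traceback. It does not import sklearn or CatBoost: `EstimatorWithoutTags`, `EstimatorWithTags`, and `more_tags` are hypothetical stand-ins that only mirror the two lines from `_rfe.py` quoted above.

```python
class EstimatorWithoutTags:
    """Hypothetical stand-in for an estimator that defines no _get_tags method."""


class EstimatorWithTags:
    """Hypothetical stand-in for an estimator that exposes its tags."""

    def _get_tags(self):
        return {"allow_nan": False}


def more_tags(estimator):
    # Mirrors the lines from the traceback: ask the wrapped estimator for its
    # tags and fall back to allow_nan=True if the key is absent.
    estimator_tags = estimator._get_tags()
    return {"poor_score": True,
            "allow_nan": estimator_tags.get("allow_nan", True)}


# Without _get_tags the attribute lookup itself fails.
try:
    more_tags(EstimatorWithoutTags())
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# With _get_tags the wrapper simply forwards the estimator's own answer.
tags = more_tags(EstimatorWithTags())
```

The point of the sketch is that the wrapper never inspects the data for NaN itself; it only needs the estimator to answer the `_get_tags()` call.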
I worked around the problem with this hacky subclass:
class CatBoostRegressor(CatBoostRegressor):
    def _get_tags(self):
        return {'allow_nan': True}
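A variant of the same workaround that avoids shadowing the imported class name could use a small mixin. This is only a sketch: `AllowNanTagsMixin`, `FakeCatBoostRegressor`, and `TaggedRegressor` are hypothetical names, and the fake base class is there so the snippet runs without CatBoost installed. In real use the second base class would be `CatBoostRegressor` (or `CatBoostClassifier`).

```python
class AllowNanTagsMixin:
    """Hypothetical helper that supplies the _get_tags method RFECV looks up."""

    def _get_tags(self):
        # Only the 'allow_nan' tag is read in the traceback above.
        return {"allow_nan": True}


class FakeCatBoostRegressor:
    """Stand-in for catboost.CatBoostRegressor so the sketch runs on its own."""

    def fit(self, X, y):
        return self


class TaggedRegressor(AllowNanTagsMixin, FakeCatBoostRegressor):
    """In real use, replace FakeCatBoostRegressor with CatBoostRegressor."""


model = TaggedRegressor()
tags = model._get_tags()
```

Keeping the tag logic in a mixin means the same few lines can be reused for both the regressor and the classifier.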
Your answer is great.
I confirmed that it works.
CatBoost takes a lot of time, but it's a challenge.
Here is a PR that adds this method: https://github.com/catboost/catboost/pull/1133
Thanks @annaveronika!
I'm sorry that I missed pull request #1133.