Shap: Support for sklearn Pipeline

Created on 6 Dec 2018 · 3 Comments · Source: slundberg/shap

Most helpful comment

Hi,

I am wondering about the same thing, since I cannot get SHAP to work with an sklearn Pipeline for regression problems. My code is as follows:

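import sklearn.datasets  # import the submodule explicitly; a bare "import sklearn" does not guarantee it is loaded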
import pandas as pd
import shap
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, FunctionTransformer
from sklearn.decomposition import PCA

boston = sklearn.datasets.load_boston()
data = boston['data']
y = boston['target']
columns = boston['feature_names']
X = pd.DataFrame(data,columns=columns)
X_train = X[:300]
X_test = X[300:]
y_train = y[:300]
y_test = y[300:]
rs = 1

lr_pipe = Pipeline([
    ('SC', StandardScaler(with_mean=True)),
    ('pca', PCA(random_state=rs)),
    ('Lr', Ridge(random_state=rs)),
])

lr_pipe.fit(X_train,y_train)

explainer = shap.LinearExplainer(lr_pipe, X_train, feature_dependence="independent")

Exception: An unknown model type was passed:

All 3 comments

Hi,
I came across the same problem. Instead of lr_pipe, you should pass the final estimator, lr_pipe.named_steps['Lr'].
But that raises the next problem: to get the SHAP values, we need the representation of each sample after the scaling and PCA steps. Any idea how to get it from the pipeline, or how to incorporate SHAP into the pipeline?
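
One possible way to get that intermediate representation is to apply every fitted step except the last one by hand. The sketch below is only an illustration under stated assumptions: the helper name transform_without_final_step is hypothetical, and a synthetic dataset from make_regression stands in for the Boston data. To stay runnable without shap installed, it computes the SHAP values from the closed form for a linear model with independent features, coef_i * (z_i - mean(z_i)); the corresponding shap call is shown as a comment.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Boston data (an assumption for this sketch).
X, y = make_regression(n_samples=400, n_features=13, noise=10.0, random_state=1)
X_train, X_test = X[:300], X[300:]
y_train = y[:300]

lr_pipe = Pipeline([
    ("SC", StandardScaler(with_mean=True)),
    ("pca", PCA(random_state=1)),
    ("Lr", Ridge(random_state=1)),
])
lr_pipe.fit(X_train, y_train)


def transform_without_final_step(pipe, X):
    """Hypothetical helper: run X through every fitted step except the last."""
    Z = X
    for _name, step in pipe.steps[:-1]:
        Z = step.transform(Z)
    return Z


# Representation the final estimator actually sees (scaled, then PCA-rotated).
Z_train = transform_without_final_step(lr_pipe, X_train)
Z_test = transform_without_final_step(lr_pipe, X_test)
ridge = lr_pipe.named_steps["Lr"]

# Closed-form SHAP values for a linear model, assuming independent features:
# phi_i = coef_i * (z_i - E[z_i]), with the training set as background data.
background_mean = Z_train.mean(axis=0)
shap_values = ridge.coef_ * (Z_test - background_mean)
expected_value = ridge.predict(Z_train).mean()

# With shap installed, the corresponding call would be along the lines of:
#   explainer = shap.LinearExplainer(ridge, Z_train)
#   shap_values = explainer.shap_values(Z_test)
```

Note that these values attribute the prediction to the PCA components, not to the original columns, because the explainer only sees the data as the final estimator receives it. On scikit-learn versions that support pipeline slicing, lr_pipe[:-1].transform(X_test) should produce the same Z_test as the helper.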

Hi,

I started to make my own custom function "just_transform"
