Shap: Support for sklearn Pipeline

Created on 6 Dec 2018 · 3 Comments · Source: slundberg/shap

saurabhhjjain

All 3 comments

Hi,

I am wondering about the same thing, since I cannot get SHAP to work with an sklearn Pipeline for regression problems. My code is as follows:

import pandas as pd
import shap
from sklearn.datasets import load_boston
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Note: load_boston was removed in scikit-learn 1.2, so this needs an older release
boston = load_boston()
data = boston['data']
y = boston['target']
columns = boston['feature_names']
X = pd.DataFrame(data, columns=columns)

# First 300 rows for training, the rest for testing
X_train = X[:300]
X_test = X[300:]
y_train = y[:300]
y_test = y[300:]
rs = 1

# Scale, project with PCA, then fit a ridge regression
lr_pipe = Pipeline([
    ('SC', StandardScaler(with_mean=True)),
    ('pca', PCA(random_state=rs)),
    ('Lr', Ridge(random_state=rs))])

lr_pipe.fit(X_train, y_train)

# Passing the whole pipeline raises the exception shown below
explainer = shap.LinearExplainer(lr_pipe, X_train, feature_dependence="independent")

Exception: An unknown model type was passed:

MMron on 6 Jan 2020


Hi,
I came across the same problem. Instead of lr_pipe, you should pass lr_pipe.named_steps['Lr'].
But that moves the problem to the next step: to get the SHAP values we need the new representation of the samples (after the scaling and PCA). Any idea how to get it from the pipeline, or how to incorporate SHAP into the pipeline?

NogaG on 25 Mar 2020
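
One possible way to get the transformed representation, sketched here under some assumptions: slicing a fitted Pipeline with `[:-1]` (available in scikit-learn 0.21 and later) gives a sub-pipeline holding every step except the final estimator, and its `transform` returns the samples after scaling and PCA. The data below is a synthetic stand-in from `make_regression`, not the dataset from the thread. To keep the sketch free of a SHAP dependency, it computes the attributions with the closed form for a linear model with independent features, `coef * (z - mean(z))`; passing the final step and the transformed training data to `shap.LinearExplainer` should be the equivalent library route, though that call is not exercised here.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the data used in the thread (assumption).
X, y = make_regression(n_samples=400, n_features=13, noise=5.0, random_state=1)
X_train, X_test = X[:300], X[300:]
y_train = y[:300]

lr_pipe = Pipeline([
    ("SC", StandardScaler(with_mean=True)),
    ("pca", PCA(random_state=1)),
    ("Lr", Ridge(random_state=1)),
])
lr_pipe.fit(X_train, y_train)

# Every step except the final estimator: scaling followed by PCA.
preprocess = lr_pipe[:-1]
model = lr_pipe.named_steps["Lr"]

# The representation the final linear model actually sees.
Z_train = preprocess.transform(X_train)
Z_test = preprocess.transform(X_test)

# Closed-form attributions for a linear model with independent features:
# each feature contributes coef * (value - background mean).
phi = model.coef_ * (Z_test - Z_train.mean(axis=0))
base_value = model.predict(Z_train).mean()

# Additivity check: base value plus attributions recovers the pipeline output.
reconstructed = base_value + phi.sum(axis=1)
print(np.allclose(reconstructed, lr_pipe.predict(X_test)))
```

Note that these attributions belong to the PCA components, not to the original columns, because that is the input space of the final step.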

Hi,

I started to make my own custom function "just_transform"

MMron on 26 Mar 2020
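
The custom function itself was not posted in the thread. As a purely hypothetical sketch of what a helper with that name might do (this is not MMron's code), the version below walks through the fitted steps of a Pipeline, skips the final estimator, and applies each step's `transform` in turn. The random data and the step names are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def just_transform(pipeline, X):
    """Hypothetical helper: apply every fitted step except the last one to X."""
    Xt = X
    for _name, step in pipeline.steps[:-1]:
        # Pipelines accept None or "passthrough" as a no-op step.
        if step is None or (isinstance(step, str) and step == "passthrough"):
            continue
        Xt = step.transform(Xt)
    return Xt


# Small random regression problem, used only to exercise the helper.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=50)

pipe = Pipeline([
    ("SC", StandardScaler()),
    ("pca", PCA(random_state=1)),
    ("Lr", Ridge(random_state=1)),
])
pipe.fit(X, y)

Z = just_transform(pipe, X)

# Feeding the transformed data to the final step matches the full pipeline.
print(np.allclose(pipe.named_steps["Lr"].predict(Z), pipe.predict(X)))
```

The returned array is what the final estimator receives, so it can serve as the data argument when explaining `pipe.named_steps["Lr"]` on its own.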
