Shap: How to define the neighbourhood and base value for the feature selection

Created on 9 May 2018 · 4 Comments · Source: slundberg/shap

1. Could you please tell me how Kernel SHAP implements (or selects) the neighbourhood for instance generation in the code?
2. Could you please describe how you select the feature values for the base value E[f(z)] in the code?
3. Don't we need to discretize the selected instances or the data here?
4. Could you please explain how Kernel SHAP uses the training set to generate new instances?

DiliSR

Most helpful comment

Good questions. Some of them are best answered by looking at a bare-bones, brute-force implementation of Kernel SHAP: https://github.com/slundberg/shap/blob/master/notebooks/Simple%20Kernel%20SHAP.ipynb

It enumerates all subsets below a certain size that can be fully covered by the chosen number of samples. Then it randomly draws the rest of the samples from the Shapley kernel density.
The data matrix passed when constructing the KernelExplainer represents the background. So the base value is just the average output of the model applied to this matrix.
I think checking out the notebook I linked to above would help clarify this and question 4.
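
As a rough illustration of the enumerate-then-sample idea described above, here is a minimal standard-library sketch. It is not the code from the shap package: the helper names `shapley_kernel_weight` and `sample_coalitions` are hypothetical, pairing each coalition size with its complement is an assumption of this sketch, and duplicate random draws are simply kept rather than merged.

```python
import itertools
import math
import random


def shapley_kernel_weight(M, s):
    """Shapley kernel weight of one coalition with s of M features present (0 < s < M)."""
    return (M - 1) / (math.comb(M, s) * s * (M - s))


def sample_coalitions(M, nsamples, seed=0):
    """Hypothetical helper: return nsamples binary masks over M features.

    Coalition sizes are fully enumerated (smallest and largest first) while
    they fit in the sample budget; the rest of the budget is drawn at random.
    """
    rng = random.Random(seed)
    masks = []
    remaining = nsamples
    enumerated = set()

    for s in range(1, M // 2 + 1):
        sizes = sorted({s, M - s})  # a size and its complement (assumption)
        count = sum(math.comb(M, k) for k in sizes)
        if count > remaining:
            break  # this size can no longer be covered completely
        for k in sizes:
            for subset in itertools.combinations(range(M), k):
                masks.append(tuple(1 if i in subset else 0 for i in range(M)))
        enumerated.update(sizes)
        remaining -= count

    # Draw the rest: pick a size in proportion to its total kernel mass,
    # then pick a uniformly random subset of that size.
    left = [k for k in range(1, M) if k not in enumerated]
    if left:
        mass = [shapley_kernel_weight(M, k) * math.comb(M, k) for k in left]
        for _ in range(remaining):
            k = rng.choices(left, weights=mass)[0]
            subset = set(rng.sample(range(M), k))
            masks.append(tuple(1 if i in subset else 0 for i in range(M)))
    return masks


masks = sample_coalitions(M=4, nsamples=10)
```

With 4 features and a budget of 10 samples, sizes 1 and 3 (8 coalitions) are enumerated in full, and the remaining 2 samples are random coalitions of size 2.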

slundberg on 10 May 2018

👍2

All 4 comments

Thank you.
According to the basic idea of SHAP, the base value is the value predicted by the model without any features. Am I right? So in the bare-bones brute-force implementation you have taken reference = np.zeros(M) as the base value. Am I right? Could you please tell me which variable in the original Kernel SHAP you have used as the base value? I couldn't find that in your code.

Thank you.

DiliSR on 21 May 2018

In the Kernel SHAP implementation the base dataset is provided by the user. So whatever gets passed as the dataset in the KernelExplainer constructor becomes the base dataset, and the average output of the model on this dataset becomes the base value.
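
To make this concrete, here is a small sketch in plain Python of how a base value can be derived from a background dataset. The toy `model`, the `background` rows, and the helpers `base_value` and `coalition_value` are all hypothetical and only illustrate the idea; they are not taken from the shap source. Under this reading, a single all-zeros row (as with `reference = np.zeros(M)` in the notebook) would be the special case of a one-row background.

```python
def model(x):
    # Hypothetical toy model: a fixed linear function of three features.
    return 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[2]


# Hypothetical background dataset (what would be passed to the explainer).
background = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 4.0],
    [2.0, 2.0, 0.0],
]


def base_value(model, background):
    """Average model output over the background rows."""
    return sum(model(row) for row in background) / len(background)


def coalition_value(model, x, mask, background):
    """Average output when features with mask == 1 come from x
    and the "missing" features (mask == 0) come from each background row."""
    total = 0.0
    for row in background:
        z = [xi if m else bi for xi, bi, m in zip(x, row, mask)]
        total += model(z)
    return total / len(background)


x = [1.0, 1.0, 1.0]
print(base_value(model, background))                   # no features present
print(coalition_value(model, x, (1, 0, 0), background))  # only feature 0 present
print(coalition_value(model, x, (1, 1, 1), background))  # all present: model(x)
```

The empty coalition reproduces the base value, and the full coalition reproduces the model's prediction for `x`; the SHAP values distribute the difference between the two across the features.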
